Archive for the ‘database’ Category

TechTip: DbUnit Export from JetBrains DataGrip

I am an avid test-driven development (TDD) advocate nowadays, with a pragmatic slant of course, looking to bulletproof the features that I deliver so that they do what is expected, and to work out edge cases.

A big challenge in testing is generating test data, which is needed to set up some integration test workflows. I have been using Jailer (http://jailer.sourceforge.net/) to generate data from existing tables in a DbUnit format which I can then embed in my test dataset XML files.

This is a challenge due to the mapping of relationships by Jailer (a neat feature, by the way). So while working in DataGrip, our database IDE of choice, we were struck by how it can export different formats when looking at a table. Exporting from there would allow us to leverage the available filtering and searching features to nail down the datasets that need to be exported.

On contacting the support team through Twitter (https://twitter.com/0xdbe/status/853900122828222465/photo/1), the recommendation was to modify the existing XML Groovy script to generate DbUnit XML, following the steps at https://www.jetbrains.com/help/datagrip/2017.1/extending-the-datagrip-functionality.html

And well, an hour later, a Groovy script that does just that can be found at https://gist.github.com/ssmusoke/ca4c55b4e52de97acb99a590644a677f

The code was not rendering well here, hence the move to a Gist.
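For readers who have not seen the target format, here is a rough illustration of what a DbUnit flat XML dataset looks like when built from table rows. This is not the Groovy script from the Gist; it is a small Python sketch, and the function name, table name and sample rows are made up for the example. In the flat format each row becomes one element named after the table, with one attribute per column, and a NULL is represented by leaving the attribute out.

```python
from xml.sax.saxutils import quoteattr


def rows_to_dbunit_flat_xml(table_name, rows):
    """Render rows (a list of dicts) as a DbUnit flat XML dataset string."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<dataset>"]
    for row in rows:
        # One element per row, one attribute per column; NULL columns are omitted.
        attrs = " ".join(
            f"{column}={quoteattr(str(value))}"
            for column, value in row.items()
            if value is not None
        )
        lines.append(f"  <{table_name} {attrs}/>" if attrs else f"  <{table_name}/>")
    lines.append("</dataset>")
    return "\n".join(lines)


# Hypothetical sample data, standing in for rows filtered in the IDE.
example_rows = [
    {"id": 1, "name": "Alice", "email": None},
    {"id": 2, "name": "Bob & Co", "email": "bob@example.com"},
]
dbunit_xml = rows_to_dbunit_flat_xml("person", example_rows)
print(dbunit_xml)
```

The attribute values are escaped with quoteattr so that characters such as & do not break the XML.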

Alternate Approach to Legal Independent Election Tallying

The Uganda elections are more or less over with less than 6 hours for the Uganda Electoral Commission (EC) to announce the results for the presidential elections.

Given all the time on our hands, with no social media, the team at Styx Technology Group designed the following alternative approach to independent electoral vote tallying for future elections that provides inbuilt mechanisms for audit and verification of results.

The primary data sources for the process are:

  1. Official EC list of polling stations and voters per polling station
  2. Photos of the signed election tally sheets from each polling station. To ensure that the photos are not tampered with and provide an audit trail:
    • Each photograph has to carry information on the camera, the GPS coordinates of where the photo was taken, and the date and time when it was taken. Many cameras record and share this using the Exchangeable Image File Format (EXIF)
    • Two separate photos of the tally sheets have to be taken by different cameras
    • The camera equipment may be registered beforehand to provide validation of the source of the information
    • The signatures of the returning officers and stamp must be clear and visible in the photo

The architecture for the technology solution is as follows:

  1. Web based solution accessible via any browser. Due to poor Internet connectivity in many areas of the country, an Android app would be provided to assist in data collection, with the data sent once the user gets into an area with Internet.
  2. The field officers who capture the photos would also be provided with an option of entering the candidate vote tallies.
  3. In the tallying center, data clerks enter the candidate vote tallies from the photos received. In order to reduce errors the following approach would be used:
    • The clerks are randomly assigned photos as they come in
    • The tally for a station must be entered correctly by two separate data entry clerks, then approved by a supervisor. This process is formally called the two-pass verification method or double data entry.
  4. All correctly entered data is shared with the rest of the world for download and analysis.
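The double data entry step can be sketched in a few lines. This is only an illustration in Python of the rule described above (two independent matching entries, then supervisor approval); the class name, clerk identifiers and candidate names are hypothetical and not part of the proposed system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StationTally:
    """Collects independent entries of one polling station's tally sheet."""
    station_id: str
    entries: dict = field(default_factory=dict)  # clerk id -> {candidate: votes}
    approved_by: Optional[str] = None

    def submit(self, clerk_id, tally):
        """Record the tally one clerk typed in from the photo."""
        self.entries[clerk_id] = dict(tally)

    def agreed_tally(self):
        """Return the tally two different clerks entered identically, else None."""
        seen = []
        for tally in self.entries.values():
            if tally in seen:
                return tally
            seen.append(tally)
        return None

    def approve(self, supervisor_id):
        """A supervisor can only approve once two entries match."""
        if self.agreed_tally() is None:
            raise ValueError("two matching entries are required before approval")
        self.approved_by = supervisor_id


station = StationTally("station-001")
station.submit("clerk-a", {"Candidate X": 120, "Candidate Y": 98})
station.submit("clerk-b", {"Candidate X": 120, "Candidate Y": 89})  # digits swapped
before_third_entry = station.agreed_tally()  # no two entries agree yet
station.submit("clerk-c", {"Candidate X": 120, "Candidate Y": 98})
station.approve("supervisor-1")
```

A mismatch simply means the photo goes to another randomly assigned clerk until two entries agree.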

This system is mission-critical, having to be available for the entire vote counting period of 48 hours, so the architecture includes the following paths for data collection:

  1. Multiple access IP addresses and domains for the website in case some are blocked off
  2. Any data collected via the Android app can be sent via email to a dedicated tallying center address. To ensure that only data from the app is received and not changed in transit, encryption is used.
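As a hedged illustration of the "only from the app and not changed in transit" requirement: the post only says that encryption is used and does not name a scheme. The Python sketch below swaps in a keyed hash (HMAC-SHA256) over the payload, which checks origin and integrity but does not hide the contents. The shared secret, function names and payload fields are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def sign_payload(payload, secret_key):
    """Serialize the payload and compute an HMAC tag to send along with it."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    tag = hmac.new(secret_key, body, hashlib.sha256).hexdigest()
    return body, tag


def verify_payload(body, tag, secret_key):
    """Recompute the tag at the tallying center and compare in constant time."""
    expected = hmac.new(secret_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


# Hypothetical secret provisioned into the app and known to the tallying center.
app_key = b"hypothetical-shared-secret"
body, tag = sign_payload({"station": "station-001", "Candidate X": 120}, app_key)

accepted = verify_payload(body, tag, app_key)
tampered = verify_payload(body.replace(b"120", b"921"), tag, app_key)
```

The body and tag could travel as an email attachment; anything that fails verification is discarded.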

The inspiration came from a quote by Gandhi, “Be the change you wish to see in the world”, and from wanting to disprove the myth that there is no local capability to design and implement such solutions and, most of all, that such solutions have to be complex.

Looking forward to hearing your thoughts and suggestions…

Opinion: Microsoft Demise being Overrated but they are still a Major Player

This post was initially a comment on the Mashable post Why Microsoft Is Being Left in the Dust, but I figured it was too long and needed its own post.

The demise of Microsoft has been predicted continuously over the last 20 years; however, what most commentators forget is that Microsoft innovates best when it's the underdog:

a) Browsers – beat Netscape to a pulp and was the browser king for the next 10 years (do I hear IE 6?), and today, after a steady decline, IE has 50% market share and is holding steady; watch out for a rise with IE 10

b) Desktop – anybody remember OS/2?

c) Office Productivity – WordPerfect, Lotus 1-2-3, anyone?

d) Exchange – Novell GroupWise

e) Networking/Active Directory – Novell NetWare, hey Windows NT; how many people use Windows boxes for Active Directory and File and Print as against Samba?

f) Xbox – do I hear Nintendo DS? PS3 is just picking up but Sony is suffering

g) Databases – watch out Oracle/DB2, SQL Server is deep within the departments and getting many enterprise features

h) Antivirus – McAfee/Norton/Kaspersky, watch out for the free MS Security Essentials; I know I have never looked back

i) Corporate Intranets – Documentum, watch out for SharePoint

j) Open Source – top 10 contributor to Linux; speaking as a developer, PHP/MySQL support on Windows is excellent

k) Programming Languages – .NET has caught up; it seems like each and every NGO/UN department is running licensed Windows/IIS/SQL Server/SharePoint/Exchange

l) Android – who has made the most money from licensing patents? Samsung/HTC/Barnes and Noble/Motorola are all paying

m) Mobile – they may be late to this game, but PCs are here to stay and complement smartphones/tablets, so there is no danger there; Linux is not yet mature enough for the desktop and Mac OS X not available

n) Research Labs – apart from IBM, which uses research as a competitive edge, HP closed down theirs, Google are still in the game, and the only other company in this big time seems to be Redmond

o) Healthcare – with those US automation dollars flowing down

p) Enterprise Applications – they may have faltered but this market is growing with Navision and Dynamics

q) Channel – who else is bigger and better at harnessing this resource? These are an extension of the sales force

r) Development tools – I do not use them but they are the envy of many a developer

Please share your thoughts and opinions

Databud – Startup Weekend Kampala – April 27 to 29

I will be attending my first Startup Weekend in Kampala, from April 27 to April 29, 2012, and well, I thought: why not share my pitch and get advice on how to refine it? No idea is great unless shared, right?

In the absence of #opendata in Uganda, there is a whole lot of data locked up within individual government systems and documents, in non-standard formats. It needs to be unlocked and set free so that it can grow (Data Bud): the data buds and grows.

A picture is worth a thousand words right – below is the whole concept

Data Bud Concept

Comments, additions, advice? Looking forward to seeing ya this weekend

The Poor Man’s Job Queue

Not all software development projects are treated the same. Some have access to modern tech: Virtual Private Servers (VPS), Zend Server (http://www.zend.com/products/server/), Memcached, Gearman and all the other goodies I can only dream of. Others get a box with LAMP, on which you cannot install anything else.

This is an example of how we got around a limitation using available tools.

Problem: I have a list of tasks to execute within my application. I need to ensure that the tasks are executed and completed, but some are more important than others, and the execution may slow down the performance of the box we are running on.

In this case we were loading 6 different types of XML files, which were FTPed into a location on the box regularly, every 35 minutes, and had to be loaded in a specific order. This was further complicated by the fact that we had to reload historical data in case of issues (1 week's worth of uploads ~ 2100 files) without interrupting the current loading processes.

The approach used the following components:

a) Job Queue – based on the Zend Server Job Queue but simplified for our needs (see data model of tables below)

Job Queue Data Model

b) Queue Loader Script – loads the jobs into the job queue by scanning the location containing the files to be loaded and adding the files to the queue (since the queue is a database table, duplicates are discarded without errors). This keeps this file simple and honest

c) Job Executor Script – reads a message from the queue, then reads the message body, which contains the name of the file to be processed; this could be made more complex

d) Queue Loader Cron Job – calls the Queue Loader Script to add new files to the queue

e) Job Executor Cron Job – calls the Job Executor Script. This job has no effect if a lock file exists and has not expired, which means a valid executor is still running. However, if the lock file has expired, it means that the process crashed, so the lock file is deleted and a new process is started with a new lock file. Basically this keeps the job executor script running for as long as there are messages to process.
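The components above can be sketched roughly as follows. This is not our production code (which was PHP on a LAMP box): it is a Python illustration that uses SQLite as a stand-in for the database, and the table layout, the 10 minute lock expiry and all function names are assumptions made for the example. A unique column is what lets the loader discard duplicates quietly, and the lock file's modification time is what tells a running executor apart from a crashed one.

```python
import os
import sqlite3
import tempfile
import time

LOCK_MAX_AGE_SECONDS = 600  # assumed expiry; pick one longer than any single job


def create_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " file_name TEXT NOT NULL UNIQUE,"
        " priority INTEGER NOT NULL DEFAULT 0,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def load_queue(conn, directory):
    """Queue loader: add every file found; already queued files are ignored."""
    for name in sorted(os.listdir(directory)):
        conn.execute("INSERT OR IGNORE INTO job_queue (file_name) VALUES (?)", (name,))
    conn.commit()


def acquire_lock(lock_path, max_age=LOCK_MAX_AGE_SECONDS):
    """Return True if this run may proceed, False if a fresh lock exists."""
    if os.path.exists(lock_path):
        if time.time() - os.path.getmtime(lock_path) < max_age:
            return False      # another executor is still running
        os.remove(lock_path)  # expired lock: assume the previous run crashed
    with open(lock_path, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def run_executor(conn, lock_path, process):
    """Job executor: drain pending jobs by priority, refreshing the lock as it goes."""
    if not acquire_lock(lock_path):
        return 0
    done = 0
    try:
        while True:
            row = conn.execute(
                "SELECT id, file_name FROM job_queue WHERE status = 'pending'"
                " ORDER BY priority DESC, id LIMIT 1"
            ).fetchone()
            if row is None:
                break
            process(row[1])
            conn.execute("UPDATE job_queue SET status = 'done' WHERE id = ?", (row[0],))
            conn.commit()
            os.utime(lock_path)  # touch the lock so it does not expire mid-run
            done += 1
    finally:
        os.remove(lock_path)
    return done


# Small demonstration with two empty files in a temporary inbox.
workdir = tempfile.mkdtemp()
inbox = os.path.join(workdir, "inbox")
os.mkdir(inbox)
for name in ("a.xml", "b.xml"):
    open(os.path.join(inbox, name), "w").close()

conn = sqlite3.connect(":memory:")
create_queue(conn)
load_queue(conn, inbox)
load_queue(conn, inbox)  # a second scan adds nothing new
processed = []
lock_path = os.path.join(workdir, "executor.lock")
count = run_executor(conn, lock_path, processed.append)
```

Note that the check-then-create lock here is not atomic; with cron runs spaced minutes apart that was acceptable for us, but a stricter setup would create the lock file exclusively.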

Please feel free to leave a comment on what your experiences are with similar problems. 

Doctrine2 Day 3 – Proxies, Associations, Relationships

Well, if you are following this series, then by now you are aware that we have the validators set up, and we are almost ready to go. Well, not quite. I ran into an issue with proxies and class loaders which took a while to resolve. What I did was:

a) Changed the Zend Framework-Doctrine2 integration to the Bisna integration (https://github.com/guilhermeblanco/ZendFramework1-Doctrine2) – note the additional configurations before

b) Learnt this rule: DO NOT SAVE ANY MODELS IN SESSION OR CACHES due to Proxy auto loading issues

c) Develop unit tests for the model validations and saving as you go along, because they will save you as you shift things around. I am currently trying to save an association to the database, and thanks to the tests I have running I can try out the different association mappings without issues, because my unit tests track the effect of each change.

Now onto relationships. What I found was as follows:

  1. I have been having a tough time dealing with relationships coming from Doctrine 1, which autoloaded relationships; with the Doctrine 2 data mapper, you have to load the associations yourself
  2. In Doctrine 2, you cannot map both the foreign key column and the relationship, as this creates problems for the ORM mapper. Since I need to access both the foreign key value, for example personid, and the related object person, my approach has been as follows:
    • Added a person instance and relationship mapping to the person as per the Doctrine documentation
    • Added a setter setPersonID() which loads the Person from the database using the provided ID and sets it on the person relationship
    • Added a getter getPersonID() which obtains the id of the person for use on the screen
  3. As recommended, avoid bi-directional relationships where possible; try to keep them uni-directional.
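The setter/getter approach from point 2 can be sketched as below. This is a language-neutral illustration in Python rather than the actual PHP entity: the repository class is a hypothetical stand-in for the ORM lookup, and the method names simply mirror setPersonID() and getPersonID().

```python
class Person:
    def __init__(self, person_id, name):
        self.id = person_id
        self.name = name


class PersonRepository:
    """Hypothetical stand-in for the ORM lookup of a Person by ID."""

    def __init__(self, people):
        self._by_id = {person.id: person for person in people}

    def find(self, person_id):
        return self._by_id.get(person_id)


class Document:
    def __init__(self, repository):
        self._repository = repository
        self.person = None  # only the relationship is mapped, not personid

    def set_person_id(self, person_id):
        """Load the related Person and set it on the relationship."""
        person = self._repository.find(person_id)
        if person is None:
            raise LookupError(f"no person with id {person_id}")
        self.person = person

    def get_person_id(self):
        """Expose the foreign key value by reading it from the related object."""
        return self.person.id if self.person is not None else None


repository = PersonRepository([Person(7, "Jane")])
document = Document(repository)
document.set_person_id(7)
person_id_for_screen = document.get_person_id()
```

The foreign key value is never stored on its own, so it cannot drift out of sync with the related object.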

Next I will be implementing a nested hierarchical structure using the tree nested set implementation at http://www.gediminasm.org/

Doctrine 2 – Day 2 – Model Validation using Symfony Validator Service in Zend Framework

It is just Day 2 of my experiences in the trenches; regular work had kept me from this, but I managed to get some time to keep digging. This is a follow-up to my Doctrine 2 – Day 1 – Commentary from the Trenches. The models are up and configured, and the unit testing is set up following steps from Michelangelo van Dam’s presentation (http://www.slideshare.net/DragonBe/unit-testing-zend-framework-apps).

One of the major processes that we are implementing with this migration is detailed unit testing, which was tougher with Doctrine 1. With the unit test infrastructure set up, the next item on the agenda is model validation. From previous experience (before Doctrine 1 and with Doctrine 1) it is critical to be able to specify validation rules using annotations without having to write PHP if statements. Being a ZF person, the first place to look was the ZF validator classes. While they seem to be well integrated with the forms, they would prove too verbose for validating models, since I also need to be able to specify multiple validators per column, so this would not cut it for me.

Next stop was the Symfony2 validator service (http://symfony.com/doc/current/book/validation.html), which provides validators with annotation support. So that was the easy part; the hard part was yet to come: integration. The integration followed the steps below:

a) Add Symfony Validator service to library folder, easy, just download a package from https://github.com/symfony/Validator

b) Register the Symfony Validator annotations – this is where I had problems (more later)

c) Add the Symfony Validators to the model properties

d) Add validation code. This needed a validate() method in the base class from which all entities are derived, which requires @ORM\HasLifecycleCallbacks (so that the model can hook into the lifecycle callbacks) plus @ORM\PrePersist and @ORM\PreUpdate on the validate method to ensure it is called before the models are saved (first time) or updated. More details on the Doctrine annotations can be found at http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/annotations-reference.html
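To make the idea behind steps c) and d) concrete, here is a small sketch of declarative rules on a base entity, checked automatically before saving. It is written in Python purely as an illustration and is neither the Symfony Validator nor the Doctrine API: the validator classes, the rules dictionary and the save() method are hypothetical stand-ins for the annotations and the PrePersist/PreUpdate callbacks.

```python
class ValidationError(Exception):
    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = errors


class NotBlank:
    def check(self, value):
        return "must not be blank" if value in (None, "") else None


class MaxLength:
    def __init__(self, limit):
        self.limit = limit

    def check(self, value):
        if value is not None and len(value) > self.limit:
            return f"must be at most {self.limit} characters"
        return None


class BaseEntity:
    rules = {}  # property name -> list of validators (several per property allowed)

    def validate(self):
        errors = []
        for name, validators in self.rules.items():
            value = getattr(self, name, None)
            for validator in validators:
                message = validator.check(value)
                if message:
                    errors.append(f"{name} {message}")
        if errors:
            raise ValidationError(errors)

    def save(self, storage):
        self.validate()  # plays the role of the pre-persist / pre-update hook
        storage.append(self)


class Document(BaseEntity):
    rules = {"title": [NotBlank(), MaxLength(10)]}

    def __init__(self, title):
        self.title = title


storage = []
Document("Report").save(storage)
try:
    Document("").save(storage)
    rejected = []
except ValidationError as error:
    rejected = error.errors
```

The point is that an entity only declares its rules; the base class does the checking every time, with no hand-written if statements per model.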

So what problems did I face?

1. In the Doctrine documentation, you use the annotations directly, for example @Id; however, experience has shown that you need to namespace and alias them (see the note from the Symfony integration, http://symfony.com/doc/current/book/doctrine.html). So I had to change all the Doctrine annotations to use the @ORM namespace

2. The default annotation driver only supports a single namespace, so you will need to update it as per the pastebin below

http://pastebin.com/embed_iframe.php?i=feiKsVxg

Now I am a happy camper: my models are working using Symfony validations, and we only have to write code for custom validations, which happens only about 20 – 30% of the time.

As a parting shot, the Symfony team and community have done a great job for PHP. Why? Because they provide standalone components (similar to Zend Framework), and each of their components can be used without the rest of the framework. As I was investigating the validator usage and issues, I found a thread where Fabien Potencier and team were discussing annotation support in Symfony. However, they also noted that Doctrine Commons had better support, so they stopped the work on Symfony annotation support and just used the services of the Doctrine team. This is how all software development should be done, and is a torch to the rest of us. I am a convert, and happy to be a proud member of the PHP community.

Update May 8, 2012

I had promised to provide some sample snippets of the Bisna integration and Symfony validation that I ended up using, so here we go: https://gist.github.com/2638526 The files are as follows:

a) application.ini – there is nothing special here from Bisna. Includes the cache configuration, prod/staging/dev/test environments all of which inherit from production

b) index.php – from the public folder or htdocs – this may not be perfect but it works and I am looking for ways to simplify it

c) Bootstrap – this is the file we use, highlights:

– Storing the entity manager instance in the Zend_Registry; we have a utility method which loads it from the registry and another which also provides a connection from the entity manager, so it's fully encapsulated

– initialization of the Zend_Db adapter; we need this since we are using Zend_Session to save the sessions in the database

– the last config is for other resources we use. We have a dependency on the Zend_Registry class as it hides a lot of complexity

d) Document.php – a sample model class

– it extends BaseEntity which provides automatic getters and setters through the __call method, some required fields like id, datecreated, last update date and last updated by (not all models extend this class, only those which need those auditing fields)

– Why do we have getCategoryID(), setCategoryID() for the category property instead of mapping the categoryid field from the database? See the next post in the series https://ssmusoke.wordpress.com/2012/03/25/doctrine2-day-3-proxies-associations-relationships/

– Unlike the Doctrine2 defaults we do not use tablename_id but rather tablenameid for the foreign keys, so we have to define them in each relationship.

Please let me know what I can do to make this any clearer, thanks for reading
