Friday 17 December 2010

Oakton hiring in India - and there's a billboard to prove it!

My employer Oakton is currently doing a big recruitment drive in India... this reminds me of when one of my previous employers made a 50-metre-long company logo out of vinyl, hoping it would appear on Google Maps. We spent all day laying it out and the plane didn't even fly over!

DDK

How to Use BAPI_CUSTOMER_FIND to search for Customers in SAP

BAPIs are a useful way of retrieving data from SAP for testing purposes - even if they are not the recommended way of retrieving data from SAP for external systems (you should use Enterprise Services for that). I had a problem with the quirks of one of these BAPIs today - "BAPI_CUSTOMER_FIND" wouldn't allow me to perform a wildcard search on the Customer Name (the KNA1.NAME1 field), though exact matches worked fine. It turns out there is a field "PL_HOLD" in the input parameters that must be set to "X" for wildcard matches to work at all.

So the process is:

  1. Work out the field and table name that you want with SAP transaction /nse11
  2. Test the BAPI with SAP transaction /nse37:
  3. Set MAX_COUNT to the desired maximum number of return results (e.g. 200), or to 0 for an unlimited number of results
  4. Make sure PL_HOLD is "X" to enable wildcard matching
  5. Put the table name (e.g. KNA1 for the customer master), field name (e.g. NAME1 for the customer name) and the wildcard pattern in the SELOPT_TAB table
  6. Run the BAPI to perform the search.

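The same search can also be driven from outside SAP over RFC. Below is a minimal Python sketch using the pyrfc library (SAP NetWeaver RFC SDK). The parameter names follow the steps above, but the exact import parameter and SELOPT_TAB field names are assumptions you should verify against the BAPI's definition in your own system before relying on them; the connection details in the usage comment are placeholders.

```python
# Sketch: invoking BAPI_CUSTOMER_FIND from outside SAP via pyrfc.
# Parameter names follow the steps above; verify the exact field names of the
# SELOPT_TAB structure in your system (e.g. via /nse37) before relying on them.

def build_customer_find_params(table, field, pattern, max_count=200):
    """Build the import parameters for a wildcard customer search."""
    return {
        "MAX_COUNT": max_count,  # 0 = unlimited results
        "PL_HOLD": "X",          # without this, wildcard matching is ignored
        "SELOPT_TAB": [
            # Assumed row field names - check the BAPI's table definition.
            {"TABNAME": table, "FIELDNAME": field, "FIELDVALUE": pattern}
        ],
    }

def find_customers(conn, pattern):
    """conn is an open pyrfc.Connection; returns the result table rows."""
    params = build_customer_find_params("KNA1", "NAME1", pattern)
    result = conn.call("BAPI_CUSTOMER_FIND", **params)
    return result.get("RESULT_TAB", [])

# Usage (requires a live SAP connection - host and credentials are placeholders):
# from pyrfc import Connection
# conn = Connection(ashost="saphost", sysnr="00", client="100",
#                   user="user", passwd="secret")
# for row in find_customers(conn, "ACME*"):
#     print(row)
```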

Of course you wouldn't need to worry about this if you are just using SoapUI and the CustomerSimpleByNameAndAddressQuery enterprise service, as it has no such flag to enable wildcard searching - but that's another story.

DDK

Tuesday 14 December 2010

Entity Framework "4.5" coming out in Q1 2011 as a Feature Pack/Update

The Community Technology Preview 5 (CTP5) of the Entity Framework feature update (aka EF 4.5 - name to be confirmed) was released for download last week according to the ADO.NET team blog - see http://blogs.msdn.com/b/adonet/archive/2010/12/06/ef-feature-ctp5-released.aspx

This has facilities to create databases based on:
  1. A "Code First" approach where the code defines the model, which in turn can generate the database for you. This involves the ability to define your model using the "Fluent API" rather than an entity diagram.
  2. A "Model First" approach where the normal edmx EF Designer is used to create the Model and the database can be generated from that.
I will also be looking forward to any performance improvements the guys at MS incorporate into the RTM build.

DDK

Monday 13 December 2010

How to spot a fake SanDisk SDHC Card

I recently had the misfortune of purchasing a fake 32GB SDHC card for my HTC Desire. I only found out a few weeks after my purchase, when I started to notice corruption in some of the mp3 files I was copying over to the card. Once copied, files on the Android-based phone would sit on the card for a minute or two and then disappear. To confirm the card was a fake, I formatted it, copied some large files onto it and then tried to copy them back off. The copy process failed when reading the files back from the fake media.
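
That copy-and-verify check is easy to script. Below is a minimal Python sketch of the idea, assuming the card is mounted as a normal drive (the path in the usage comment is a placeholder): write pseudorandom data to a file on the card, read it back, and compare checksums. On a fake card the comparison fails once you write past the card's real capacity.

```python
# Sketch: a copy-and-verify check for suspect flash media. Writes pseudorandom
# data to a file, reads it back and compares SHA-256 checksums. On a fake card
# the read-back data no longer matches once you exceed the real capacity.
import hashlib
import os

def write_and_verify(path, size_mb=1):
    chunk = 1024 * 1024  # 1 MB blocks
    h_write = hashlib.sha256()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            block = os.urandom(chunk)
            h_write.update(block)
            f.write(block)
    h_read = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h_read.update(block)
    return h_write.hexdigest() == h_read.hexdigest()

# Usage: point it at the mounted card with a size near its claimed capacity
# (path and size are placeholders):
# print(write_and_verify("/media/sdcard/verify.bin", size_mb=30000))
```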

Apparently, the dealers in China often rip the cards out of old GPS machines and relabel them with a fake serial number and SanDisk logos.

After looking into the topic, it turns out there are some telltale signs that give a fake card away:
  1. The serial number is on the back rather than the front, so fraudulent sellers can display the card in photos without giving their game away.
  2. The SDHC logo is not clearly printed and may appear blurred or smudged.
  3. The white writing on the card is a stark white rather than a muted white colour.
See below for a photo of the fake and the real card side-by-side. The real card is on the left and the fake is on the right-hand side:



The only reliable way of getting a real SDHC card is to deal with a local seller who has a legitimate address in Australia and whom you can follow up through the Australian Competition and Consumer Commission (ACCC) if you are sold a fake.

You have been warned!
DDK

Performance and the Entity Framework (EF) 3.5 - How to Lazy Load Fields within the one table

There are 2 different camps when it comes to application performance:
  1. Functionality should be delivered first and performance can be optimized later
  2. We should constantly strive to test and improve application performance throughout development - this is particularly important when dealing with new or unproven technologies.
While it is good to be proactive in improving performance, it can sometimes be a hindrance to the project and to delivering functionality that the client can actually use. Clearly Microsoft took the first approach with the first version of the Entity Framework (EF 3.5). As a hybrid of these two approaches, I strongly believe in a Proof of Concept, based on core use cases, for every project - aimed at proving integration approaches and the expected performance of the end system. This helps you develop some golden rules/rules of thumb for that particular implementation and can help you avoid larger-scale issues down the track.

Performance approaches aside, one of my clients recently had an issue with the performance of a system based on the Entity Framework 3.5. Many of the general issues with EF performance are well documented and I will not detail them here - however, there are some golden rules that apply to any database-driven application:
  1. Minimize the amount of data you bring across the network
  2. Minimize network "chattiness", as each round-trip has an overhead. You can batch up requests to resolve this issue.
  3. JOIN and filter your queries to minimize the number of records that SQL needs to process in order to return results.
  4. Index your DB properly and use Indexed (SQL Server)/Materialized (Oracle) Views for the most common JOINs
  5. Cache data and HTML that is static so you don't have to hit the database or the ORM model in the first place
  6. Denormalize your application if performance is suffering due to "over-normalization"
  7. Reduce the number of dynamically generated objects where possible, as they incur an overhead.
  8. Explicitly load entities rather than loading them through the ORM (e.g. via an ObjectQuery in the Entity Framework) when the ORM outputs poorly performing JOINs or UNIONs.
One thing I noticed in this application that violated Rule 1 was the use of an EF entity "SystemFile", which had a field called "Contents" that held large binary streams (aka BLOBs) and pulled them out of the database every time the entity was involved in a query. The Entity Framework doesn't support lazy loading of individual fields per se - but it does support loading related entities separately.

Using this concept, the most obvious steps seemed to be:
  1. Remove the "Contents" field from the "SystemFile" entity so it doesn't get automatically loaded when the EF entity is referenced in a LINQ to Entities query.
  2. Create an inherited entity "SystemFileContents" that holds just the contents of the file, so the application can load it only when needed.
This was fine - but my Data Access Layer then wouldn't compile and I received the following error:

Error 3034: Problem in Mapping Fragments starting at lines 6872, 6884: Two entities with different keys are mapped to the same row. Ensure these two mapping fragments do not map two groups of entities with overlapping keys to the same group of rows.

After a little investigation, I found there are a few different approaches to this error:
  1. Implement a Table Per Hierarchy (TPH) as described at http://msdn.microsoft.com/en-us/library/bb738443(v=VS.90).aspx. This would mean making some database changes to move the file binary contents into a separate table. After that I could make the parent "SystemFile" class abstract and refer only to 2 new child classes, "SystemFileWithContents" and "SystemFileWithoutContents".
  2. Simply split the table into 2 different entities with a 1:1 association, rather than an inheritance relationship, in the Entity Framework Model.
Option 2 was the best in terms of minimizing code impact, as this application had been in development for over a year. To this end, I used the advice below on mapping multiple entity types to the same table:

http://blogs.msdn.com/b/adonet/archive/2008/12/05/table-splitting-mapping-multiple-entity-types-to-the-same-table.aspx

The designer in Visual Studio 2008 doesn't support this arrangement (though the designer in Visual Studio 2010 does, as per http://thedatafarm.com/blog/data-access/leveraging-vs2010-rsquo-s-designer-for-net-3-5-projects/) - so you have to modify the edmx XML file directly and add a "ReferentialConstraint" node to correctly relate the 2 entities.

We add the referential constraint to inform the model that the IDs of these two types are tied to each other:

<Association Name="SystemFileSystemFileContent">
  <End Type="SampleModel.SystemFile" Role="SystemFile" Multiplicity="1" />
  <End Type="SampleModel.SystemFileContent" Role="SystemFileContent" Multiplicity="1" />
  <ReferentialConstraint>
    <Principal Role="SystemFile"><PropertyRef Name="FileId"/></Principal>
    <Dependent Role="SystemFileContent"><PropertyRef Name="FileId"/></Dependent>
  </ReferentialConstraint>
</Association>
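
For completeness, the mapping section (MSL) then maps both entity types to the same underlying table. The fragment below is a sketch only - the entity set name and the non-key columns ("FileName" standing in for the remaining non-BLOB columns) are assumptions for illustration:

```xml
<!-- Sketch: both entity types mapped to the one SystemFile table.
     Column names other than FileId are assumptions for illustration. -->
<EntitySetMapping Name="SystemFiles">
  <EntityTypeMapping TypeName="IsTypeOf(SampleModel.SystemFile)">
    <MappingFragment StoreEntitySet="SystemFile">
      <ScalarProperty Name="FileId" ColumnName="FileId" />
      <ScalarProperty Name="FileName" ColumnName="FileName" />
    </MappingFragment>
  </EntityTypeMapping>
  <EntityTypeMapping TypeName="IsTypeOf(SampleModel.SystemFileContent)">
    <MappingFragment StoreEntitySet="SystemFile">
      <ScalarProperty Name="FileId" ColumnName="FileId" />
      <ScalarProperty Name="Contents" ColumnName="Contents" />
    </MappingFragment>
  </EntityTypeMapping>
</EntitySetMapping>
```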

This reduced the load on SQL Server and the web server, as the application no longer had to drag the BLOB data across on every query against the SystemFile table. Any performance improvement must be measurable - so the team confirmed this with scripted Visual Studio 2008 load tests, which had a customer-validated test mix based on the expected usage of the system.

DDK