Sunday, June 19, 2016

Reboot all the thing


The golden rules of running infrastructure on AWS:
  1. Instances fail all the time, just reboot them.
  2. Design everything to fail, and handle it consistently.
  3. AWS has a large community where you can ask questions and find a solution to everything; failing that refer to rules (1) and (2)

When we launched Interana, we decided to deploy our solution on AWS as a managed service so our customers could benefit from our pre-built orchestration. Interana's proprietary analytics database runs with a replication factor of 1. This is in contrast to transactional NoSQL stores, which tend to replicate data to achieve HA. So occasional reboots do not seem so bad in an analytics system, and the cost and performance benefits are realized when managing many TBs of data.

Initially when we deployed in AWS, we observed a large number of node outages. We rebooted over and over again until we were faced with a recurrence rate of 5 outages per 100 instances per day. This seemed excessively high. Left unattended, nodes would stay down for hours.
Something was up.

Indicative of a bigger problem on AWS

The instances would become unreachable, usually signaled from Amazon Status Check alerts:
“You are receiving this email because instance i-xxxxxx in region US - N. Virginia has failed an instance or a system status check for at least 5 period(s) of 60 seconds at "Tuesday 10 March, 2015 22:01:32 UTC". You can view status check details about this instance by navigating to the EC2 console”
Auditing our setup, we found we were using the stock Ubuntu 14.04 LTS AMI from AWS. At first, we thought a simple upgrade of the Ubuntu packages to the latest versions might clear things up.
sudo apt-get update && sudo apt-get upgrade
Nothing changed. So we looked at the syslog and saw the following:
Mar 10 20:39:42 ip-10-0-0-56 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7 (xid=0x7f84a245)
Mar 10 20:39:49 ip-10-0-0-56 dhclient: No DHCPOFFERS received.
Mar 10 20:39:49 ip-10-0-0-56 dhclient: No working leases in persistent database - sleeping.
Ahh. A DHCP failure. Every 1800 seconds or so, the DHCP client on the host goes back to the DHCP server to renew its lease. When the DHCP server has problems and no lease can be obtained, the host loses its address and becomes unreachable.
However, these incidents seemed isolated to a single day and to AWS congestion issues that resolved themselves. This was clearly not the root cause.

Unintended Consequences

Another problem was that instances would get “stuck” trying to reboot. The usual remedy is to stop and start the instance. However, the storage on our i2.xlarge machine type is largely ephemeral, so a stop/start leads to 100% data loss. This forces us to restore the data from disk, which takes many hours.
Even worse, a side-by-side restore (i.e. from an EBS snapshot) can take an entire day, as cold-access throughput on EBS is roughly 1/10 of its peak value.
The most obvious approach is to replicate hot data, but i2.xlarge instances are among the most expensive machines in the AWS family, costing almost $3K per server per year. Running 100 TB of spares for a replication factor of 2 would be prohibitively expensive.

Monitoring Mashup

As we saw nodes continuing to go pear shaped, we needed to add additional monitoring in a hurry and start tracking all the outages.
Amazon comes with CloudWatch built in. While that seems great, it doesn’t really go beyond simple metrics and outage notices. It also gives statistics from the Xen hypervisor's point of view, not from what is happening on the machine. There are pros and cons to this, but mainly, if the Xen hypervisor thinks everything is fine, it won’t provide any further insight when you have an outage.
Next we installed a port ping utility called nping.  Nping is useful for checking ports without requiring ICMP, which is disabled by default on Amazon for security purposes.
sudo apt-get install nmap
nping -p 22 -c 1 <host>
I wrote a wrapper script that ran nping against all nodes in a loop. It produced the following log, which is easy enough to parse from the Linux command line:
20150325T200634|10.0.0.84|import6|Starting Nping 0.6.40 ( http://nmap.org/nping ) at 2015-03-25 20:06 UTC
20150325T200634|10.0.0.84|import6|SENT (0.0013s) Starting TCP Handshake > 10.0.0.84:22
20150325T200634|10.0.0.84|import6|RECV (0.0143s) Handshake with 10.0.0.84:22 completed
20150325T200634|10.0.0.84|import6|
20150325T200634|10.0.0.84|import6|Max rtt: 13.017ms | Min rtt: 13.017ms | Avg rtt: 13.017ms
20150325T200634|10.0.0.84|import6|TCP connection attempts: 1 | Successful connections: 1 | Failed: 0 (0.00%)
20150325T200634|10.0.0.84|import6|Nping done: 1 IP address pinged in 0.01 seconds
20150325T200634|10.0.0.82|import4|
Check for outages
cat ~/tmp/monitor_nping_all_20150325T200351.txt | grep -Pe "Failed: 1" | cut -d'|' -f2,3 | sort -n | uniq -c
14 10.0.0.48|data0
95 10.0.0.49|data1
83 10.0.0.51|data2
13 10.0.0.52|data
47 10.0.0.53|data5
31 10.0.0.54|data6
11 10.0.0.55|data7
21 10.0.0.56|data8
46 10.0.0.57|data9
108 10.0.0.58|data10
26 10.0.0.59|data11
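The same per-host outage count can be sketched in Python. This is a minimal sketch, assuming the wrapper log format shown above (timestamp|ip|name|nping output) and that a failed single-probe run reports "Failed: 1". The function name count_failures and the failing sample lines are hypothetical, made up for illustration.

```python
from collections import Counter

def count_failures(lines):
    """Count failed probes per (ip, name) from wrapper-log lines.

    Each line is assumed to look like: timestamp|ip|name|nping output
    """
    failures = Counter()
    for line in lines:
        # The nping text itself contains '|', so split at most 3 times
        parts = line.rstrip("\n").split("|", 3)
        if len(parts) < 4:
            continue  # skip malformed lines
        _timestamp, ip, name, message = parts
        # A single-probe run (-c 1) that fails reports "Failed: 1"
        if "Failed: 1" in message:
            failures[(ip, name)] += 1
    return failures

# Hypothetical sample lines in the wrapper-log format
sample = [
    "20150325T200634|10.0.0.84|import6|TCP connection attempts: 1 | Successful connections: 1 | Failed: 0 (0.00%)",
    "20150325T200640|10.0.0.49|data1|TCP connection attempts: 1 | Successful connections: 0 | Failed: 1 (100.00%)",
    "20150325T200646|10.0.0.49|data1|TCP connection attempts: 1 | Successful connections: 0 | Failed: 1 (100.00%)",
]
for (ip, name), count in sorted(count_failures(sample).items()):
    print(count, ip + "|" + name)
```

Keeping the counts keyed by (ip, name) makes it easy to sort by worst offender or feed the numbers into another tool.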
After collecting the data, we can also process it with Python scripts.
Check the distribution of response times:
cat /monitor_nping_all_xxx.txt | cut -d'|' -f4 | tr -s ' ' | grep "pinged" | grep -woPe "\d+\.\d+" | python3 -c "\
import sys
from collections import defaultdict
# One response time (in seconds) per input line
allf = [float(line.strip()) for line in sys.stdin if line.strip()]
# Histogram: response time -> number of occurrences
hist = defaultdict(int)
for valf in allf:
    hist[valf] += 1
for resp, count in sorted(hist.items()):
    print('{:0.2f} = {}'.format(resp, count))"
avg=0.01,max=1.06,min=0.00, count=2398
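A summary line like the one above could be produced along these lines. This is only a sketch: the function name summarize is hypothetical, and the input is assumed to be a list of response times in seconds, such as the values extracted by the pipeline above.

```python
def summarize(times):
    """Return a one-line summary of response times given in seconds."""
    if not times:
        return "count=0"
    avg = sum(times) / len(times)
    return "avg={:0.2f},max={:0.2f},min={:0.2f}, count={}".format(
        avg, max(times), min(times), len(times)
    )

# Hypothetical sample values
print(summarize([0.01, 0.02, 0.03, 1.06]))
```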
We also leveraged Datadog, a popular cloud-based monitoring stack.  Once installed, the story became clear.
[Figure: 4_networkdrop / ImportNodes-networkTest, network bandwidth on the import nodes]
Many times, AWS monitoring would alert us an hour after our own monitoring had detected the initial outage. The graph above should be a set of nicely peaked lines with a small taper. Instead, we see outages as soon as network bandwidth drops from peak. This problem wasn’t just a few nodes going out: the network cards were dropping all the time as load increased.
Finally, we added a test that SCP'd a large file to all nodes in a distributed fashion. That caused the network to drop immediately. Reproducing the problem became easy.

Enhanced is the new “Working”

After some wrangling, the AWS support folks were able to monitor the Xen hypervisor and found that the network hardware never received the data, as it died in the OS/driver. While we never got the full picture, the problem was likely due to jumbo frame support, i.e. Ethernet frames with an MTU greater than 1500 bytes. The fix was to turn on Enhanced Networking. It uses SR-IOV, a feature that lets the multi-tenant guests on a host share the network card with DMA-style transfers. For us it provided far better transfer rates (up to 50%) and better latency as well.
After we finished installing this, the transfer graph looked happy:
[Figure: NetworkHappies.png, network transfer graph after enabling Enhanced Networking]

Built-Tough, SR-IOV Tough.

After going through the install process and waiting on Canonical (Ubuntu's maintainer) to provide a fix, we decided to build the AMIs and make them public for all to use. These are based on Ubuntu 14.04.1 LTS. They should be patched with security updates immediately after deployment, i.e.
sudo apt-get update && sudo apt-get upgrade
Region                        AMI
US-EAST-1 (Virginia)          ami-30ebc55a
US-WEST-1 (N. California)     ami-6f04720f
US-WEST-2 (Oregon)            ami-440bed24
EU-WEST-1 (Ireland)           ami-68fb4d1b
EU-CENTRAL-1 (Frankfurt)      ami-4a899126
AP-NORTHEAST-1 (Tokyo)        ami-7f213611

Sometimes they come back

The driver uses Linux DKMS support, which requires the headers for the current kernel to compile the network driver. So when doing a dist-upgrade, do not forget to add the kernel headers, or your network will revert to the packaged VIF driver and enhanced networking will turn off. We had this happen a few times before we realized the driver was not installed. Note that the last command below is the really important one, as it determines which driver is actually in use.
Check that headers for the running kernel are installed, then rebuild the initramfs
dpkg -l | grep "linux-headers-`uname -r`"
update-initramfs -c -k all
Check if driver is available
modinfo ixgbevf | grep -Pe "version:\s+2.16"
Check if driver is actually in use!
ethtool -i eth0 | grep -E '(driver:|version: 2.16)'
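The "is the driver actually in use" check can also be scripted. This is a minimal sketch, assuming ethtool -i prints one "key: value" pair per line and that the enhanced networking driver is ixgbevf, as in the commands above. The function names and the sample text are hypothetical.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` output ("key: value" per line) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first ':' only; values such as bus-info contain ':'
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def uses_enhanced_networking(text, expected_driver="ixgbevf"):
    """True if the interface reports the SR-IOV virtual function driver."""
    return parse_ethtool_info(text).get("driver") == expected_driver

# Hypothetical sample of what `ethtool -i eth0` might print
sample = """driver: ixgbevf
version: 2.16.1
firmware-version: N/A
bus-info: 0000:00:03.0"""
print(uses_enhanced_networking(sample))
```

On a live host, the text would come from running ethtool -i eth0, for example through the subprocess module, and a False result would mean the host has fallen back to the vif driver.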
Another problem we faced was segmentation offloading causing SSH connections to time out on hosts without enhanced networking (i.e. using the vif driver). The following message appeared in /var/log/syslog many times:
xen_netfront: xennet: skb rides the rocket: 19 slots
We later found this bug filed against the Ubuntu source tree, along with a workaround: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317811
Turn off scatter-gather (and with it segmentation offloading) on hosts using the VIF driver:
sudo ethtool -K eth0 sg off

Conclusion

When building out an AMI, AWS provides you with a lot of tools and a good set of base AMIs to start with. However, always ensure that the hardware is validated before deploying to production. Also, when restarting or rebooting machines, try to do as much root cause investigation as early as possible, before the problem becomes malignant and spreads to all your infrastructure. Use all the monitoring tools available to get a complete picture.

Thursday, February 18, 2016

How to get an Honorary Roaming Architect Degree

Back in grade 6, we had reading club (Scholastic, remember that?). The purpose was to transform into a Roboreader who could finish the most books in a month. Then you got a cute little magnet that you could put on your fridge so your parents could brag to all the other parents (while your sibling got the same one for trying hard, and you made fun of him until he cried).

I also remember almost finishing the perfect season. We had math drills once a week: 10 math questions in one minute. I earned the nickname Terminator. In the last month, I had a brain fart and slipped on 2x3 = 8. We had learned powers that day, and a bit flipped somewhere in my head. Later in college there was Keener Bingo and the Unix Beirdos, which made me realize that we engineers get bored and need labels to help cluster like-minded people.

The aftermath: I can't remember a single book I read (Charlotte's Web, was that a pig or a spider?), and after reaching the pinnacle of 12x12 I didn't want to go further for fear of lowering my average. (I again sat out the final softball game to preserve my .900 batting average 15 years later, sorry Joe.) I also routinely go to the Beirdos for help with Perl, and of course for those wonderful vi-shortcut sheets.

So back on topic, what do you need to do to earn this title? BTW, it's not Roman Architect, a person who designs stuff in Fortran claiming it's the top mathematical system in the world (they are lurking around, beware!). I suppose when I'm 60 I won't remember a single journey or an object from a class, but we have blogs now, right?!? Blogger, don't get bought out by Micro-AARP!

Honorary Doctorate in Travel'n & Architect'n
  • Must have done some architecting in 5 different countries. Note that off-shoring your design does not count, and we do TRANSFER credits if the country has split due to civil unrest (Czechoslovakia), but this is limited to the last 50 years (sorry, California).
  • You must have been employed by 5 different companies. Your own ventures count, but aliases or domains of the same venture do not. Working = 1 year min. Revenue = optional.
  • Must have designed systems in at least 3 different sectors, e.g. banking, finance, military, social, media, etc. (no, Facebook, Twitter, squared in are not sectors).
  • Must have attempted to learn 5 different languages (sign language does not count). An attempt means you got a reply; toll-free and automated services do not count.
  • Must have been the primary source for 1 data breach involving foreign data security practices. Note that fixing and reporting the breach is optional. Distribution of the breach is highly discouraged.
  • Taken 2 complete hacks which broke at least one of the following: patent, copyright, reverse-engineering, and obfuscated them into something you sold off as original work.
  • Must have recruited at least 3 foreign tech workers, ending up in a financial stipend. Off-shoring does not count (see above), and it's a bonus if it was during a recession.
  • Must have participated in 2 complete disasters of a project; one had to be at least government related and costing taxpayers money.
  • You've tried karaoke'ing in at least 2 languages other than English. Note that singing to your boarding wards or eligible daughters does count.
  • You've done a blog in 2 languages other than English, only to give up once you realized they pay in centavos/per/click, not centos/per/click. Zut alors, c'est dommage!
  • Must be on two patent disclosures. Note the following are excluded: fonts/colors/fusion, but we accept them as long as your name is shown. Social widgets are accepted.



So there it is, we'll be providing the Honorary Doctorate in Travel'n & Architect'n pending underwriting from one of our out-sourced eastern sponsors.

Meet these and get a badge (sorry not magnet, but likely to be a widget).

How can you get validation? Resumes... hell no. Your blog, of course. After all, who else is going to provide you with an honorary degree? You're not Bill Cosby, after all.
Are some of these personal, and could they get you in hot water? Do what I do: wait until things blow over and we can all laugh about it. After all, your Roaming Architect degree is a lifetime adventure.









Tech : Microsoft Why all the hate

It's amazing how much we've evolved in the 2000s, with buzzwords such as Open Source, Web APIs, Cloud Computing, Mobile Computing, Commerce, Social and Geo-location becoming part of the lingua franca.

But one company, and one word, are all but forgotten and left for dead: Microsoft and Productivity.

The very first OS I worked on was MS-BASIC 5.1 and DOSSHELL. I hacked up my first Visual Basic program based on the GORILLAS.BAS demo (Angry Birds, anyone?), later reused in a Turbo Pascal hockey emulator (yes, all we Canadians luv hockey).

Now before I begin my list, I have to say that by the time I was in university, we all hated Microsoft. It was simple: they were the biggest, baddest software company out there, with their seemingly infallible leader, Bill Gates (who, after his philanthropic career, seems all that much harder to hate). The software interns came back very cocky and half repossessed, they offered the crappiest pizza during the recruiting sessions, they notoriously had the titles for every lame job (SDETs are still around!), and worst of all you had to mail them and beg for a Platform SDK DVD, so they could keep tabs on you. There was so much hate that at my very first job we created a new advertisement for new hires with something like this as the tag line:

<Want to write the next-generation enterPlayerNumberOne(name) function for Solitaire 2.0? Don't become a microserf, join us for a real challenge>

My first commercial product was in Visual Studio C++, using their Document-View architecture. Thanks to the old one-big-UI-hook event pump, I prototyped it quickly (3 days), complete with a serial driver and a multitasking processing layer. It fell just as quickly (5 months later) due to the inability to decouple the UI from the model and constant redesign of the UI. I cursed at having the idealist Java just around the corner but never being able to leverage it, due to MS's continuous tampering and forcing everyone onto their Visual Studio MFC (multifunctional crap) APIs. Even then, MFC was way faster, and how could I explain to my clients, "Oh, that's the Java that makes it slow and ugly, but underneath it's all OO!"?

I also recall another death struggle with corporate powers: that old Microsoft freebie syndrome. SourceSafe, the "free and functional" Microsoft version control. Well, we did an old PC World-style shootout, and it finished last in every category. People gasped when it made code vanish into thin air. Nevertheless, I was singled out for forcing a $500/seat version control system into the company and not giving positive reviews to Microsoft's ugly daughter. This was quite often the case with upper management: it's from Microsoft and it's FREE, what else do u wanT!


Anyways, I digress. So here are the things we owe to them, in my time:

1) Office UI: Quite simply the first graphical office suite, the most productive and integrated, with OLE support

2) Windows User Experience Guidelines: Quite simply one of the most followed UI standards at the time.

3) Direct3D: Pretty much at the center of all 3D games for the last decade and a half (okay, it didn't power Wolfenstein)

4) Bill Gates: Before Steve Jobs, the single most compelling man in tech. His humbleness, giving, and quiet, incorruptible nature made it easy to want him to represent us.

5) Windows 95: Simply the first UI that was adopted by the masses. Yes, it was slow, but until then the command line was out of reach for most people.

6) Windows NT: Well before Google, someone came up with the idea of replacing big ole mainframes with commodity hardware and portable user-space code. IMHO it pushed Unix into what Linux is today

7) Importance of the Internet: Though Microsoft has surprisingly done very poorly in every facet of mobile computing, Bill Gates predicted it in 1995

8) Productivity of the United States: From 1980 to 2000, the US held a great advantage due to accessible computing



Okay, so there is probably a list double this size of why people don't like them, but why kick someone while they're down?



Had it not been for the antitrust suit in the early 2000s, where would they be today?

Wednesday, April 4, 2012

R&D to Collaborative Solutions

As I look at the state of funding for big government projects like Space, Telecom and Alternative Energy, I realise there has been a fundamental shift away from the Centralised R&D model. This model is the grandfather from which the entire tech industry started. Western governments recognised its importance and created incubators and tax reimbursements to stimulate growth. Virtually everything that touched technology was considered R&D up until 2000.

Now in the two-thousand-tens, we are still at the beginning of the cloud revolution. I know I'm going to sound like I've missed the boat, but I think the cloud has penetrated less than 5% of households, and I see it being 80% by 2020.

Which raises the question: with all the APIs that can do complicated things like machine learning, semantic analysis and storing your data, delivered on high-performing clusters out of the box, what should I do? Plus they're unit tested and have 1000s of committers on GitHub. What is R&D in this context?

R&D was the process of connecting software and hardware together to deliver an engineering solution to a compromised set of requirements. To do this you would need on-site resident mathematicians, PhDs, a full research team, hardware specialists, operations and personnel. Teams on SINGLE projects were routinely up to 100 large. It would take 3-5 years to deliver infrastructure software. Your software was likely the backbone for many business processes. Your hardware was proprietary, and you had MacGyver trying to solder capacitors back on to save your last signal processor, and John 'Hannibal' Smith trying to string machines together through layer 2 HDLC. All in all, you NEEDED experts in every area to be successful. Patents were prevalent.

You can see where I'm going with this. Now my average day consists of hanging out on stackexchange, coderanch.com, mashable.com and Google, trying to find solutions in packages available through open source and APIs. Oh look, someone from Russia posted a library on GitHub to highlight fields using JavaScript. I have 30 books covering pretty much every solution I need, but can I spend 1 hour looking for that book, and another hour trying to read a chapter, anymore?

We are in the era of Collaborative Solutions. I have to maintain breadth of knowledge in 10 functional areas and do a capability matrix on the choices of, let's say, caching paradigms, search engines, collaborative filtering, event messaging back-ends, NoSQL and SQL DBs, MVCs and even IDEs. And when I'm asked to execute, I have to take the best of breed and deliver a system that will work in a RAD environment, be maintainable and scale to a million users in the first 6 months. If I have doubts, I have to maintain relationships with algorithm specialists, Java architects, and big data and cloud solutions experts, and take them for a beer for some advice. If I don't have a contact, I look for a local meetup.com group and try to hit them up for advice. If needed we'll get some consultants on retainer.

An example is in Java frameworks: right off the bat you create beans which are pooled to provide threading capability for concurrency. However, if you want to run a single process in a pool, it actually needs to be CONFIGURED to go slower, which is the complete 180 from how this problem would have been approached before. You ASSUME concurrency right off the bat. If you don't, you'll have 100s of database connections open and left to rot.

So the focus has changed. Most of us are in solution delivery. It's nostalgic to think about how life was before the web, and how collaboration has changed our thinking. Out with books, in with blogs. Out with research, in with mashups. Out with system engineering, in with evolutionary prototyping.

As for myself, I fought this for a few years. I finally came to the realisation that this model is better for all of us. As software becomes more commoditized, I can profit from my efficiency in delivery. At the end of the day, the consumer only spends so much money, and regardless of how cool we think something is, it's only going to amount to a $1.99 sale on an app store. So what we give up in Research is replaced by Product Solutions.

For new grads this is much harder. When I started, I found it difficult that I could not hone my techniques by learning how electrical phenomena like QPSK, Gold codes, LFSRs and the Rayleigh fading model worked; I had to take them for granted and write specs based on books and advice. For a new grad that is an even bigger transition, as I've seen that the CS schools have tried to stick to fundamental algorithm solving, and not big architecture and practicality. Of course we need very good developers, and lots of them; however, it would be great if we could see a course in "Solution Delivery"!



Saturday, February 11, 2012

Ubuntu Login problems : Getting ch$wned

After leaving my computer on for the night, I awoke to a frozen screen. My laptop is powered by Ubuntu 10.10 and has some problems with hibernate at times. So I proceeded to do a hard power-down and reboot. Only when it came up, I got a blank screen. Oh crap, did my video card die? Disk, are you corrupt?

So it may be that something got screwed up last night by my Hadoop installation, but I'm not sure what. I was also suspicious it could be a disk hardware issue. The following is a log of what I had to do to bring it back up.

1) First, when installing Ubuntu, you get a recovery mode installed that allows you to bypass all the GNOME stuff. I used this to get to the recovery menu.

I picked "root login with network only" to get a bash shell.
Next I typed in
$ ifconfig eth0 up
$ startx

It logged into the Ubuntu desktop under my root account. I ran
$ touch somefile.txt
I also ran df, mount and dmesg to check if there were hardware errors that had caused the drive to mount read-only. I found none. At this point I could have run fsck, but it didn't look like simple corruption, so I decided to skip consistency checking and keep it in my back pocket.


2) Now I tried to log on from bash to see if the problem was gdmsetup (the login window) or the login itself
$ login jag
No directory, logging in with HOME=/ 
Cannot execute /bin/dash: Permission denied

Ouch, so it can't seem to find my home directory, I think, which is set in /etc/passwd.
I opened that file, and for three different accounts I changed the login shell to /bin/sh, /bin/dash and /bin/bash. All accounts failed.
Now the clues I have: root login is happy and no user account can log in. Likely permissions. So I checked the /home/user folders and performed a chown -R jag:jag /home/jag

3) Desperate, and with 2 hours gone mucking around, I searched and found this:
http://linuxgazette.net/issue52/okopnik.html

Though 10 years old, the flow chart of the Linux login is still relevant. What it told me is that the LOGIN DID work, and really the only thing wrong is the permission to run the shell /bin/sh.

So I checked the permissions on /bin/sh and its libraries
$ ls -alF /bin/sh
lrwxrwxrwx 1 root root 4 2011-11-03 18:36 /bin/sh -> dash*
$ ls -alF /bin/dash
-rwxr-xr-x 1 root root 105704 2010-06-24 13:02 /bin/dash*

$ ldd /bin/dash
     linux-vdso.so.1 =>  (0x00007fff2c793000)
    libc.so.6 => /lib/libc.so.6 (0x00007fbe75206000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbe755ae000)

Everything was 755 or better. Nothing wrong here, it seems. I went through the permissions of all the lib folders: /lib, /usr/lib, /usr/local/lib. All seemed correct and undisturbed.

4) Now I started wondering: no permission, login working... what about the logs in /var/log?

syslog.log
Feb 11 14:38:38 jsrawan-xps16 x-session-manager[2347]: WARNING: Could not connect to ConsoleKit: Could not get owner of name 'org.freedesktop.ConsoleKit': no such name
Feb 11 14:38:38 jsrawan-xps16 x-session-manager[2347]: WARNING: Could not connect to ConsoleKit: Could not get owner of name 'org.freedesktop.ConsoleKit': no such name
Feb 11 14:38:42 jsrawan-xps16 NetworkManager[1173]: <warn> error requesting auth for org.freedesktop.NetworkManager.use-user-connections: (5) Remote Exception invoking org.freedesktop.PolicyKit1.Authority.CheckAuthorization() on /org/freedesktop/PolicyKit1/Authority at name org.freedesktop.PolicyKit1: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.PolicyKit1 was not provided by any .service files
Feb 11 14:38:42 jsrawan-xps16 NetworkManager[1173]: <warn> error requesting auth for org.freedesktop.NetworkManager.network-control: (5) Remote Exception invoking org.freedesktop.PolicyKit1.Authority.CheckAuthorization() on /org/freedesktop/PolicyKit1/Authority at name org.freedesktop.PolicyKit1: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.PolicyKit1 was not provided by any .service files
auth.log
Feb 11 16:08:39 jsrawan-xps16 login[3660]: pam_unix(login:session): session opened for user testuser by jsrawan(uid=0)
Feb 11 16:08:39 jsrawan-xps16 login[3660]: pam_unix(login:session): session closed for user testuser
Okay, the auth log doesn't give any better errors. Syslog doesn't look good, but I have no clue if that is the problem. One thing I realize is that the syslog errors occur on bootup, not PER login. So I rule that out for now, assuming it's a red herring.

5) After 3 hours and many missteps and reboots, I remain confused, but convinced that the only way an executable can't run is corruption or permissions. The former is unlikely, because the same executable runs fine under my root account.

So I do an strace on "login jag". I find the failure is indeed just the execve of /bin/bash.

strace -s 10000 -vfo login.jag login jag
4135  execve("/bin/bash", ["-bash"], ["TERM=xterm", "LANG=en_US.UTF-8", "HOME=/", "SHELL=/bin/bash", "USER=jsrawan", "LOGNAME=jag", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games", "APPLICATION_ENV=development", "MAIL=/var/mail/jag", "HUSHLOGIN=FALSE"]) = -1 EACCES (Permission denied)
Aha. I realize that /bin/bash is being launched with HOME=/, which is probably fine, and it will then cd to your ~. But /home is usually root-owned, so what does it provide for 'other' permissions? What should the permissions be?

Okay, to cut a long story short, I ran
$ ls -alF /home /
and found out that the actual root DIRECTORY "/" was 600 (rw- --- ---). Without the execute (search) bit on "/", a non-root user cannot traverse into any path below it, so the login could not launch bash at all, which is a problem.
How bad was it? I read that you can get a manifest file listing what the permissions should be. Luckily for me, I could tell that the permissions of files throughout the system seemed good (i.e. I didn't run a chown -R or chmod -R). But the root "/" was 600. I logged into another Ubuntu server in the cloud, and lo and behold, ALL ROOT FOLDERS are 755, with the exception of /root and /lost+found, which are 600, and tmp, which is 777.
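A check like this could have found the culprit much faster. The following is a rough sketch, not a complete permission model: it only looks at the 'other' execute (search) bit on each ancestor directory and ignores ownership, group membership and ACLs. The function name untraversable_components is hypothetical.

```python
import os
import stat

def untraversable_components(path):
    """Return ancestor directories of `path` lacking the 'other' execute bit.

    A user who is neither owner nor in the group needs execute (search)
    permission on every directory in the path, otherwise execve() fails
    with EACCES.
    """
    blocked = []
    current = os.path.abspath(path)
    while True:
        parent = os.path.dirname(current)
        if parent == current:
            break  # reached the filesystem root, already checked
        mode = os.stat(parent).st_mode
        if not mode & stat.S_IXOTH:
            blocked.append((parent, oct(stat.S_IMODE(mode))))
        current = parent
    return blocked

# Walks /bin and then / for the shell from the strace output
for directory, mode in untraversable_components("/bin/bash"):
    print(directory, mode)
```

On a healthy system this prints nothing for /bin/bash. On my broken laptop it would have reported "/" with mode 0o600.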

And there you go. So the moral of the story is: if Linux breaks, 9 times out of 10 it's YOU that broke it. Review your ~/.bash_history to see if you ran something destructive with chmod or chown. And this is not Windows; you do not need to re-install the OS every time a registry setting goes bad.

Some other things I did that didn't help but could in other cases
- Ensured that the passwd file is valid with $ pwconv
- Ran $ passwd -u <name> to ensure the account was not locked out
- Created a new user with useradd and passwd to have a fresh example
- Removed the ~/.Xsession in case we have some cached login
- Never change permission on system files, without 100% being sure.  You can
- Ran apt-get upgrade --show-upgraded in case we had some broken packages
- Ran dpkg-reconfigure gdmsetup  ubuntu-desktop to check if packages had been deselected
- Ran ssh login to see if it was isolated to gnome or just general login.  

Tuesday, January 24, 2012

Tech : Extending PHPUnit for Data Driven testing

Testing for PHP? Look no further than PHPUnit (http://www.phpunit.de). This framework has all the spunk that JUnit provides for the Java platform.

However, once I pushed it onto a full team, I found there were things I missed from my homegrown testing platform written in Perl. Perhaps we can take PHPUnit and adapt it for our needs?

However, PHPUnit is so tightly integrated into JetBrains' PhpStorm, with all the functionality it brings via test suites, that it would be a pain to rewrite, especially just for testing. Luckily, the implementation is fully OO, so we can subclass the PHPUnit classes.

Here are my additional requirements:

R-1) For all tests we want folders for input, expected output and output. The first two are in version control and define the test. The latter, output, is generated for each test run. When a test finishes, we want the outputs to be automatically compared to the expected output, which is in version control.

The names of the folder are:
 
output/test_
input/test_
expectedoutput/test_

- is a unique test identifier, preferrably alphanumeric.
- is the name of your test.
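The comparison step in R-1 is language-agnostic, so here is a rough sketch of the idea in Python rather than PHP. It assumes flat folders (no sub-directories) laid out as output/<test folder> and expectedoutput/<test folder> under a common base directory. The function name compare_test_dirs and its parameters are hypothetical and are not part of PHPUnit.

```python
import filecmp
import os

def compare_test_dirs(base_dir, test_name):
    """Compare output/<test_name> against expectedoutput/<test_name>.

    Returns a list of human-readable differences; empty means a match.
    Assumes both folders are flat (files only).
    """
    out_dir = os.path.join(base_dir, "output", test_name)
    exp_dir = os.path.join(base_dir, "expectedoutput", test_name)
    problems = []
    for name in sorted(os.listdir(exp_dir)):
        actual = os.path.join(out_dir, name)
        if not os.path.isfile(actual):
            problems.append("missing: " + name)
        # shallow=False forces a byte-by-byte content comparison
        elif not filecmp.cmp(os.path.join(exp_dir, name), actual, shallow=False):
            problems.append("differs: " + name)
    return problems
```

A test would pass when the returned list is empty, and the list itself doubles as the failure message.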


R-2) The test cases should be NICELY formatted so all tests and outputs are clearly reviewable.
Performance, time and memory should be dumped after each test. We can diff these runs over time to measure performance and memory impacts as well.
e.g.

 

BTST_Formatted_TestSuite::setUp()

+++++++++++++++++++++++++++++++++
------------------------------------------------
---------------Starting: 'test0a_enumerationException'  Date: '2012-01-24T22:57:04+05:30'
------------------------------------------------
    ***Memory at End: 12.917344 MB***

Date:  2012-01-24T22:57:05+05:30 
Duration:  0.037098169326782 
FINISHED with result: SUCCESS()
-------------------------------------------- 


R-3) All of this should be summed together into a suite summary at the end, with the reason for failure for any failed cases
 
BTST_Formatted_TestSuite::tearDown()
****************************************************************************
**************** TestSuite BCOM_SegmentFormat_AllTests FAILED
****************
**************** Total Cases: 21
**************** Total Passes: 20
**************** Total Failed: 1
**************** Total Errors: 1
**************** Total Assertions: 114
**************** Total Time: 0.52025103569031 s
****************************************************************************
There was 1 failure:

1) Warning
Test method "test4bHelper" in test class "SegmentFormatTest" is not public.

Btst_Formatted_TestSuite.php:35

There was 1 error:

1) SegmentFormatTest::test5b_MsgGrpLoopbackOutOfOrderNominal
MultipleFirstSegmentTypeException: Recieved Multiple segments for first segType, abort queue 

/SomeFile1.php:615
/SomeFile1.php:712
/SomeFile2:468
/SomeFile3:35
E_CORE_WARNING: PHP Startup: Unable to load dynamic library '/usr/local/zend/lib/php_extensions/imagick.so' - libMagickWand.so.2: cannot open shared object file: No such file or directory
#0 Unknown


Time: 1 second, Memory: 16.75Mb

OK (0 tests, 0 assertions)

Okay, so those are the requirements.  How do we achieve this?
Section 19 of the PHPUnit manual (http://www.phpunit.de/manual/current/en/extending-phpunit.html) discusses how to extend PHPUnit.



Now that the high-level design is done, let's get into the design details.

Figure 1: Static UML Diagram for PHPUnit DD

1) Btst_Framework_TestCase.  This class uses the name of the current test to locate its folders as per (R-1).  It also provides generic helpers such as database loading, resolves the input/output/expected directories, and compares the files in two folders.

 
/*
*
* Purpose of class is to allow expected results, inputs and outputs
* to be auto managed
*
*/
class Btst_Framework_TestCase extends PHPUnit_Framework_TestCase
{

    const INPUT_DIR = 'inputs';
    const OUTPUT_DIR = 'output';
    const EXPECTED_DIR = 'expectedoutput';
    
    protected $_fullUnitPath = NULL;
    
    protected $_buildExpectedInputPaths = false;
    
    public function getInputD()
    {
        if ( ! $this->_fullUnitPath )
        {
            throw new BsmsObjectNotInitialized("_fullUnitPath" . "\nPlease follow the Setup Before Class as per BtstStackTest");
        }
        $osCommands = new ButlOsCommands();
        $inputDir = $osCommands->joinPath($this->_fullUnitPath , self::INPUT_DIR);   
        $dir = $osCommands->joinPath($inputDir , $this->getName());
        
        return $dir;
    }
    
    public function getOutputD()
    {
        if ( ! $this->_fullUnitPath )
        {
            throw new BsmsObjectNotInitialized("_fullUnitPath" . "\nPlease follow the Setup Before Class as per BtstStackTest");
        }
        $osCommands = new ButlOsCommands();
        $inputDir = $osCommands->joinPath($this->_fullUnitPath , self::OUTPUT_DIR);   
        $dir = $osCommands->joinPath($inputDir , $this->getName());
        return $dir;
    }
    
    
    public function getExpectedD()
    {
        if ( ! $this->_fullUnitPath )
        {
            throw new BsmsObjectNotInitialized("_fullUnitPath" . "\nPlease follow the Setup Before Class as per BtstStackTest");
        }
        $osCommands = new ButlOsCommands();
        $inputDir = $osCommands->joinPath($this->_fullUnitPath , self::EXPECTED_DIR);   
        $dir = $osCommands->joinPath($inputDir , $this->getName());      
        return $dir;
    }
    
    
    //does a recursive rmdir
    public static function rrmdir($dir)
    {
        if (is_dir($dir))
        {
            $objects = scandir($dir);
            foreach ($objects as $object)
            {
                    $osCommands = new ButlOsCommands();
                    $filePath = $osCommands->joinPath($dir ,$object);              
                    if ($object != "." && $object != "..")
                    {
                        if (filetype($filePath) == "dir")
                        {
                            self::rrmdir($filePath);
                        }
                        else
                        {
                            unlink($filePath);
                        }
                    }
            }
            reset($objects);
            rmdir($dir);
        }
    }
    
    public static function setUpBeforeClassClearOutDir( $myUnitTestDir )
    {
    
        //@TODO:  Need to fix on windows before continuing.
        if ( ! $myUnitTestDir )
        {
            throw new BsmsObjectNotInitialized( $myUnitTestDir . "\nPlease follow the Setup Before Class as per BtstStackTest" );
        }         
        if ( ButlOsEnum::thisOS() == ButlOSEnum::LINUX() ||  ButlOsEnum::thisOS() == ButlOSEnum::MAC() || ButlOsEnum::thisOS() == ButlOSEnum::WINDOWS() )
        {
            $osCommands = new ButlOsCommands();
            $outdir = $osCommands->joinPath($myUnitTestDir, self::OUTPUT_DIR);
            if ( is_dir($outdir) )
            {
                self::rrmdir( $outdir);
            }
        }
    }
    
    
    
    
    protected function setUp()
    {
    
        //@TODO: Need to fix on windows before continuing.
        if ( ButlOsEnum::thisOS() == ButlOSEnum::LINUX() ||  ButlOsEnum::thisOS() == ButlOSEnum::MAC() ||  ButlOsEnum::thisOS() == ButlOSEnum::WINDOWS()  )
        {
            mkdir( $this->getOutputD(), 0775, true);
            
            if ( $this->_buildExpectedInputPaths)
            {           
                if (!is_dir($this->getInputD()) )
                {
                    mkdir( $this->getInputD(),   0775, true);
                }
            
                if (!is_dir($this->getExpectedD()) )
                {
                    mkdir( $this->getExpectedD(),  0775, true);
                }
            }
        }    
    }
    
    protected function tearDown()
    {
        //check expected output and fail if
        //missing file on either side, or doesn't match.
        
        //Need to fix on windows before continuing.
        
        $memoryInMb = memory_get_usage()/1e6;
        $str = '***Memory at End: ' . $memoryInMb . " MB***\n";
        print $str;
        
        //@TODO: Need to fix on windows before continuing.
        if ( ButlOsEnum::thisOS() == ButlOSEnum::LINUX() ||  ButlOsEnum::thisOS() == ButlOSEnum::MAC() || ButlOsEnum::thisOS() == ButlOSEnum::WINDOWS() )
        {              
            $filesExp =  $this->getAllInDir( $this->getExpectedD() );
            $filesNew =  $this->getAllInDir(    $this->getOutputD() );
            
            $compareFiles = $this->compareDir ($filesExp, $filesNew );
            
            
            $isPassed = true;
            
            if ( count($filesExp) > 0 )
            {
                print "+++Error: Files were missing from output+++\n";
                print_r ( array_values($filesExp) );
                $isPassed = FALSE;
            }
            if ( count($filesNew) > 0 )
            {
                print "+++Error: Extra Files found in output+++\n";
                print_r ( array_values($filesNew) );
                $isPassed = false;
            }
            
            
            $notMatchedArray = array();
            foreach ($compareFiles as $fileCompare )
            {
                $osCommands = new ButlOsCommands();
                $outputPath = $osCommands->joinPath($this->getOutputD() , $fileCompare);
                $expectedPath = $osCommands->joinPath($this->getExpectedD(), $fileCompare);
                
                $rv = $this->compareFiles( $outputPath , $expectedPath);
                if (!$rv)
                {
                    array_push($notMatchedArray,  $fileCompare);
                }
                else
                {
                    print "+++Success: $fileCompare +++\n";
                }
            }
            if ( count($notMatchedArray) > 0 )
            {
                print "+++Error: Files Contents Not Matched +++\n";
                print_r ( array_values($notMatchedArray) );
                $isPassed = false;
            }
            if (!$isPassed)
            {
                print  "To Compare: " . PHP_EOL . "WinmergeU " . $this->getExpectedD() . " " . $this->getOutputD() . PHP_EOL;            
            }
            $this->assertTrue($isPassed, "The Test Results Data did not match expected");
        }    
    }
    
    
    /**
    * Compares the file lists of two directories and returns the names present in both.
    * Matched names are removed from the two input arrays (passed by reference), so on
    * return they hold only the files that exist on one side; the caller can use them
    * for further decisions.
    * Assumes both lists are sorted.
    * @param array $expectedlist
    * @param array $newList
    * @return array $comparables
    */
    public function compareDir(&$expectedlist, &$newList)
    {
        //if sorted just go down linearly.
        $comparables = array();
        $count1 = count($expectedlist);
        $count2 = count($newList);
        
        
        
        if ($count1 <= 0 || $count2 <= 0 )
        {
            return array();
        }
        
        $i = 0;
        $j = 0;
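        do
        {
            $cmp = strcmp( $expectedlist[$i], $newList[$j] );
            if ( $cmp < 0 )
            {
                //expected name sorts first: it has no match, skip past it
                $i++;
            }
            else if ( $cmp > 0 )
            {
                //output name sorts first: it has no match, skip past it
                $j++;
            }
            else
            {
                //same name on both sides: record it and drop it from both lists
                array_push($comparables, $expectedlist[$i]);
                array_splice($expectedlist, $i, 1);
                array_splice($newList, $j, 1);
            }
            
        } while ($i < count($expectedlist) && $j < count($newList) );
        
        return $comparables;
    }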
        
    public function compareFiles($file1, $file2)
    {
        $isMatch = FALSE;
        $contents1 = file($file1, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES   );
        $contents2 = file($file2, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES   );
        $t1 = array_diff($contents1, $contents2) ;
        $t2 = array_diff($contents2, $contents1) ;
        if (  count ( $t1 )  == 0 &&  count ( $t2 ) == 0)
        {
            $isMatch = TRUE;
        }
        return $isMatch;
    }
    
    /*
    * @string $path
    * @array $dir
    */
    public function getAllInDir($path)
    {
        $dir = array();
        
        if ( is_dir($path) )
        {
            $directory = dir($path);
            
            if (!$directory) return array();
            
            while ( false !== ($entry = $directory->read()) )
            {
                $osCommands = new ButlOsCommands();
                $filePath = $osCommands->joinPath($path , $entry);           
                if( is_file($filePath))
                {
                    array_push( $dir, $entry);
                }
            }
        }
        sort($dir, SORT_STRING); //compareDir() assumes sorted lists; read() order is not guaranteed
        return $dir;
    }
    
    
    public function readInput($file, $mode)
    {
        $osCommands = new ButlOsCommands();
        $filePath = $osCommands->joinPath($this->getInputD() , $file);       
        $output = file_get_contents ( $filePath );
        return $output;    
    }
    
    public function writeOutput($file, $output)
    {
        $osCommands = new ButlOsCommands();
        $filePath = $osCommands->joinPath( $this->getOutputD() , $file);            
        file_put_contents( $filePath,  $output);
        return TRUE;
    
    }
    
}
2) Now create a test listener, which lets us format the different types of failures.
 

class Btst_SimpleTestListener extends PHPUnit_TextUI_ResultPrinter
{
    
    /**
    * @var integer
    */
  protected $_numAssertions = 0;
  
  public function addError(PHPUnit_Framework_Test $test,
           Exception $e,
           $time)
  {
    printf(
      "Error while running test '%s'.\n",
      $test->getName()
    );
  }
 
  public function addFailure(PHPUnit_Framework_Test $test,
             PHPUnit_Framework_AssertionFailedError $e,
             $time)
  {
    printf(
      "Test '%s' failed.\n",
      $test->getName()
    );
  }
 
  public function addIncompleteTest(PHPUnit_Framework_Test $test,
                    Exception $e,
                    $time)
  {
    printf(
      "Test '%s' is incomplete.\n",
      $test->getName()
    );
  }
 
  public function addSkippedTest(PHPUnit_Framework_Test $test,
                 Exception $e,
                 $time)
  {
    printf(
      "Test '%s' has been skipped.\n",
      $test->getName()
    );
  }
 
  public function startTest(PHPUnit_Framework_Test $test)
  {
   $name = $test->getName();
    $nowDate = new ButlDateTimeUtils();
    $nowDateStr = $nowDate->retrieveUTCFormat();
    $buffer = sprintf(
      "\n------------------------------------------------" .
      "\n---------------Starting: '%s'  Date: '%s'" .
      "\n------------------------------------------------\n",
      $name, $nowDateStr
    );
    print ($buffer);

        
  }
     /**
     * @param  PHPUnit_Framework_Test $test
      * @param float $time
     */
  public function endTest(PHPUnit_Framework_Test $test, $time)
  {
   
    $result = $test->getResult();
    $this->writeGlobalProgress($test);

    $nowDate = new ButlDateTimeUtils();
    $nowDateStr = $nowDate->retrieveUTCFormat();

    
    //$success = $result->wasSuccessful();
    $msg = $test->getStatusMessage();
    $success = $test->hasFailed() == TRUE ? "FAIL":"SUCCESS";
 
    
    printf(
        "\n-->Date:  $nowDateStr " .
        "\n-->Duration:  $time " .
       "\n-->FINISHED with result: $success($msg)"
    );
  }
 
  public function startTestSuite(PHPUnit_Framework_TestSuite $suite)
  {
    printf(
"\n****************************************************************************" .
"\n****************TestSuite '%s' started" .
"\n****************************************************************************\n" ,
      $suite->getName()
    );
    
    
  }
 
  public function endTestSuite(PHPUnit_Framework_TestSuite $suite)
  {
  
   if ($suite instanceOf BTST_Formatted_TestSuite)
   {
    $results = $suite->_getSuiteResult();
    $resultStr = $results->wasSuccessful() ? "SUCCESSFUL" : "FAILED";
 printf(
 "\n****************************************************************************".
 "\n**************** TestSuite " . $suite->getName() . " " . $resultStr .
 "\n****************" .
 "\n**************** Total Cases: ". $results->count()  .
 "\n**************** Total Passes: ". ($results->count() - $results->failureCount()).
 "\n**************** Total Failed: ". $results->failureCount() .
 "\n**************** Total Errors: ". count($results->errors()) .
 "\n**************** Total Assertions: ". $this->_numAssertions .
 "\n**************** Total Time: ". $results->time() . " s" .
 "\n****************************************************************************\n");
     
        $this->printFailures($results);
        $this->printErrors($results);
   }
    
    else
    {
     printf("\n**************** TestSuite " . $suite->getName() . " Finished");
    }
    
    //Results
    
    
  }
  
  
   /**
     *
     */
    public function writeGlobalProgress($testCase)
    {
     $this->_numAssertions += $testCase->getNumAssertions();
    }
}

3) The TestSuite is needed to attach the listener defined above to the TestResult.  We can also set up other things here, such as how errors are shown and strict-mode testing.
 
class Btst_Formatted_TestSuite extends PHPUnit_Framework_TestSuite
{
 
 /*
 * This overrides createResult to include our custom listener
 *
 */
 
 protected $_suiteResult = NULL; //hold the result so we can use it later
 
 public function createResultForListner()
 {

  $this->_suiteResult = new PHPUnit_Framework_TestResult;
  $this->_suiteResult->strictMode(TRUE);
  $this->_suiteResult->convertErrorsToExceptions(TRUE);
  $this->_suiteResult->addListener(new Btst_SimpleTestListener);
  return $this->_suiteResult;
 }
 
 public function _getSuiteResult()
 {
  return $this->_suiteResult;
 }
    
    /*
    * For some reason our createResult does not get called, so I had to override run.
    *
    */
    public function run(PHPUnit_Framework_TestResult $result = NULL, $filter = FALSE, array $groups = array(), array $excludeGroups = array(), $processIsolation = FALSE)
    {
        //always use our own result so the custom listener is attached
        $result = $this->createResultForListner();
        return parent::run($result, $filter, $groups, $excludeGroups, $processIsolation);
    }
    
    protected function setUp()
    {
    print "\nBTST_Formatted_TestSuite::setUp()";
    print "\n+++++++++++++++++++++++++++++++++";
    
    }
    
    protected function tearDown()
    {
    print "\n+++++++++++++++++++++++++++++++++";
    print "\nBTST_Formatted_TestSuite::tearDown()";
    }
    
}
4) Finally, here's a typical test case.  Not much changes, except that we save our outputs and let the framework do its magic.  You can also run the test without the special formatting if you want to revert to the PHPUnit default settings.
 
/*
* Test Driver for the ArrayTest test suite
*  phpunit --verbose  BCOM_SegmentFormat_AllTests BcomSegmentFormatTest.php
* The following line will run without all the extensions.
*  phpunit --verbose  BcomSegmentFormatTest.php
*  
*/

class BCOM_SegmentFormat_AllTests
{
    public static function suite()
    {
        $suite = new Btst_Formatted_TestSuite('BcomSegmentFormatTest');
        return $suite;
    }
    
    public static function suiteRollUp()
    {
        $suite = new PHPUnit_Framework_TestSuite('BcomSegmentFormatTest');
        return $suite;
    }
}
    
    
class BcomSegmentFormatTest extends Btst_Framework_TestCase
{
    //*************INSERT THESE THREE FUNCTIONS IN ALL TEST CASES*************************//
    static function setUpBeforeClass()
    {
        Btst_Framework_TestCase::setUpBeforeClassClearOutDir( realpath(dirname(__FILE__) ) );
        
        //Need to do special on windows!!!
        Btst_Framework_TestCase::loadDatabase_Helper(true, "password", "default");
    }
    
    
    protected function setUp()
    {
    
        $this->_fullUnitPath = realpath(dirname(__FILE__) );
        Btst_Framework_TestCase::setUp();
    }
    //**********************************************************************************//

    protected function tearDown()
    {
        Btst_Framework_TestCase::tearDown();
        //do gnokii tear down
    }

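    public function test10_testLocalIncomingDropbox()
    {
        $allInputs = $this->getAllInDir ( $this->getInputD() );
        $file = doStuff($allInputs); //placeholder for the code under test; returns the path of the file it produced
        //copy() needs a destination file path, not a directory
        $osCommands = new ButlOsCommands();
        copy( $file, $osCommands->joinPath($this->getOutputD(), basename($file)) );
    }
}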


So there you go.  The output at the end now includes more information about the failures, so you don't have to go trawling through the log.  The +++ and **** formatting breaks also give better visual cues about which output belongs to a given test, and allow for further processing as needed.  That's about it.  If you also want to run an entire test suite from a top-level acceptance suite, the following snippet can be used.


 
/*
* Runs all tests in our acceptance suite.
* Create a test suite that contains the tests
* from the ArrayTest class.
* Run it in one of the following ways.
*  phpunit --verbose AcceptanceTestSuite.php
*  phpunit --filter test2_pop AcceptanceTestSuite.php
*/
class AcceptanceTestSuite
{

public static function suite()
{
    $suite = new Btst_Formatted_TestSuite('Executing ... Your Test Suite');
    $suite->addTest(BCOM_SegmentFormat_AllTests::suiteRollUp());
    $suite->addTest(BBBB_Business_AllTests::suiteRollUp());
    return $suite;
}
}

Sunday, January 22, 2012

Tech : Singleton MDB on Glassfish 3 on OpenMq

With the EJB 3.1 specification it is possible to define MDBs (message-driven beans) that can be included in the GlassFish web profile and packaged into a .war file.  This enables pooling out of the box with minimal configuration.

With all these changes, it sometimes gets quite confusing for an IDE like NetBeans to keep up.  In my experience, configuring JMS in NetBeans was a bit of a chore once you start customizing.  I needed to connect to a public REST interface with a rate limit, so I wanted only one JMS message to be processed at a time, which was more work than I thought.

It's key to note that the CONSUMER and CONNECTION pool sizes are not what make the JMS consumer concurrent; the MDB bean pool is.  Keep this in mind before you start changing your <connector-connection-pool> to max-pool-size="1" steady-pool-size="1".  Doing so only changes the maximum number of connections allowed, which is typically not what you want, and causes timeouts as soon as more than one client connects.  Likewise, imqConnectionFlowLimit="50" and imqConsumerFlowLimit="1000" only change how messages are batched from the queue broker to the consumer.  Again, they do not affect concurrency.
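
As a rough plain-Java analogy (this is not GlassFish or JMS API), a fixed-size worker pool shows why the size of the bean pool, rather than the connection settings, bounds how many messages are handled at once.  PoolSizeDemo and peakConcurrency are made-up names, and the 5 ms sleep is only a stand-in for onMessage() work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: the worker pool plays the role of the MDB bean pool.
final class PoolSizeDemo {

    /** Handles messageCount dummy messages with poolSize workers and returns the peak number in flight. */
    static int peakConcurrency(int poolSize, int messageCount) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < messageCount; i++) {
            workers.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest overlap seen
                try {
                    Thread.sleep(5); // stand-in for the work done in onMessage()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }

        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

With a pool size of 1 the peak is always 1, which is the behaviour wanted for a rate-limited API; with a larger pool the peak can rise up to the pool size.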

Finally, using the synchronized keyword has no impact on an MDB.  This is because MDBs are designed to run on clusters, where it is not possible to synchronize across nodes.  You should never try to manage threading yourself when using any bean, including an MDB.

Below are the steps to configure a simple MDB with a queue connection.  I'm using CMT (container-managed transactions) and an ObjectMessage carrying a DTO to serialize the message.  A thread sleep is put in to achieve quick and dirty rate buffering.

1) Add your queue broker to your project via its jar files, imq.jar and jms.jar (http://mq.java.net/).

2) Add connector resources for jms/javaee6/FranchiseLocationQueue in glassfish-resources.xml (formerly named sun-resources.xml)
 

        
            
            
    
    
3) Create a message-driven consumer.  Note that transactions are managed by the container and the database connection by the persistence unit.  A transaction is started automatically before onMessage() is invoked.
 
@MessageDriven(mappedName = MMSG_FranchiseLocationConsumer.QUEUE_NAME,  description="Geoencodes Franchises", activationConfig = {
        @ActivationConfigProperty(  propertyName = "destinationType", propertyValue = "javax.jms.Queue"),                                                                                  
        @ActivationConfigProperty(  propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),  
} )
public class MMSG_FranchiseLocationConsumer implements MessageListener 
{
    
    public final static String CONN_FACTORY_NAME = "jms/javaee6/FranchiseLocationFactory";    
    public final static String QUEUE_NAME = "jms/javaee6/FranchiseLocationQueue"; 

    @PersistenceContext(unitName = "myPU")
    private EntityManager _em;
       
    //This is the context of this bean.
    @Resource
    private MessageDrivenContext context;      
    
    private static Log log = LogFactory.getLog(MMSG_FranchiseLocationConsumer.class);   
    
    // Constructor. The container sets up the JMS connection; we only log here.
    public MMSG_FranchiseLocationConsumer() throws Exception 
    {        
            log.info("Starting JMS Queue " + QUEUE_NAME);
    }
        
    // Receive a message from the queue
    
    /*
     * The default is for the container to start a transaction before the onMessage method is invoked and will commit the transaction when the method returns, 
     * unless the transaction was marked as rollback through the message-driven context. There is more about transactions to discuss but for our discussion of MDBs, 
     * this will suffice. 
     */
    //
    @Override
    public void onMessage(Message message) {

        String msgId = "";
        MMSG_FranchiseDTO franchiseDTO = null;
        try 
        {
            ObjectMessage objectMessage = (ObjectMessage) message;
            
            msgId = objectMessage.getJMSMessageID();
            franchiseDTO = (MMSG_FranchiseDTO) objectMessage.getObject();
            
            log.info("Received Msg with Id:" + msgId + " with city:" + franchiseDTO.getFr().getAddress1() );           
        } 
        catch (JMSException e)
        { 
            log.error("Failed to process message: " + msgId + " with exception " + e);
            context.setRollbackOnly(); //mark the container-managed transaction for rollback
            return;
        }         
        addFranchiseToDb( franchiseDTO.getFr(), franchiseDTO.getBusinessGroupId() );
    }
    
    private void addFranchiseToDb(FranchisesRaw fr, int businessGroupId)
    {
        try
        {
            Thread.sleep(1); //note we could be prudent and keep the last accessed time and put a timer on it.
        }
        catch (InterruptedException e)
        {
            log.error("Interrupted Exception" + e);
            Thread.currentThread().interrupt(); //restore the interrupt flag for the container
        }                
        //Call your rate limited API here
    }         
}
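
The comment in addFranchiseToDb() suggests keeping the last-accessed time instead of a fixed sleep.  A minimal sketch of that idea follows, assuming the singleton pool from step 6 so that only one bean instance uses the gate at a time.  MinIntervalGate and its methods are hypothetical names and not part of the original code.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper: spaces calls at least minIntervalMillis apart by
 * remembering when the previous call was let through.
 * Not thread-safe by design; it relies on a single MDB instance using it.
 */
final class MinIntervalGate {
    private final long minIntervalNanos;
    private long nextAllowedNanos; // earliest System.nanoTime() at which the next call may pass

    MinIntervalGate(long minIntervalMillis) {
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
        this.nextAllowedNanos = System.nanoTime(); // the first call passes immediately
    }

    /** Blocks until the minimum interval since the previous call has elapsed. */
    void await() {
        long waitNanos;
        while ((waitNanos = nextAllowedNanos - System.nanoTime()) > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop waiting
                break;
            }
        }
        nextAllowedNanos = System.nanoTime() + minIntervalNanos;
    }
}
```

In the consumer this would replace Thread.sleep(1): keep one gate as a field and call await() just before the rate-limited API call.  Unlike a fixed sleep, it only waits for whatever part of the interval has not already been spent processing the previous message.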

4) Add the producer code wherever you need to send messages:
Context ctx = null;           
ctx = new InitialContext();
_connFactory = (ConnectionFactory) ctx.lookup(MMSG_FranchiseLocationConsumer.CONN_FACTORY_NAME);
_franchiseQueue = (Queue) ctx.lookup(MMSG_FranchiseLocationConsumer.QUEUE_NAME);
initConnection();            
Session session = _connection.createSession(true, Session.AUTO_ACKNOWLEDGE);
MessageProducer producer = session.createProducer(_franchiseQueue);
ObjectMessage message = session.createObjectMessage();
message.setJMSMessageID("test message 1");
message.setObject( new MMSG_FranchiseDTO( franchiseRaw, businessGroupId ) );
producer.send(message);
session.close();
_connection.close();

5) Add the DTO that is passed across as the message.  (Note that you can bundle other POJOs, such as database entity classes, inside it, as long as they are serializable.)
        
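public class MMSG_FranchiseDTO implements Serializable
{
    private static final long serialVersionUID = 1L;
    private final int _businessGroupId;
    private final FranchisesRaw _fr; // must itself be Serializable

    public MMSG_FranchiseDTO(FranchisesRaw fr, int businessGroupId) { _fr = fr; _businessGroupId = businessGroupId; }
    public FranchisesRaw getFr() { return _fr; }
    public int getBusinessGroupId() { return _businessGroupId; }
}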
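
An ObjectMessage payload travels through Java serialization, so the DTO and everything it references must implement Serializable.  The sketch below is one quick way to check that outside the container, by round-tripping an object through a byte array.  FranchiseRawStub, FranchiseDtoStub and SerializationCheck are hypothetical stand-ins, since the real FranchisesRaw entity is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the FranchisesRaw entity.
final class FranchiseRawStub implements Serializable {
    private static final long serialVersionUID = 1L;
    final String address1;

    FranchiseRawStub(String address1) { this.address1 = address1; }
}

// Hypothetical DTO with the same shape as MMSG_FranchiseDTO.
final class FranchiseDtoStub implements Serializable {
    private static final long serialVersionUID = 1L;
    final FranchiseRawStub fr;
    final int businessGroupId;

    FranchiseDtoStub(FranchiseRawStub fr, int businessGroupId) {
        this.fr = fr;
        this.businessGroupId = businessGroupId;
    }
}

final class SerializationCheck {
    /** Serializes value to bytes and reads it back; fails if any part of the graph is not serializable. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Object graph is not serializable", e);
        }
    }
}
```

Calling SerializationCheck.roundTrip(...) on a DTO in a unit test catches a non-serializable field before the message is ever put on the queue.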

6) Now, if you want a singleton bean, the following needs to be added manually to glassfish-ejb-jar.xml in the web/WEB-INF folder.  By default NetBeans copies everything in WEB-INF, so you don't need to worry about adding it to build-impl.xml.  Also, if you ask NetBeans to add a "standard-deployment-descriptor", it will only add an ejb-jar.xml, which DOES NOT have all the options you get from the GlassFish deployment descriptor.

  
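<glassfish-ejb-jar>
  <enterprise-beans>
    <name>First Module</name>
    <ejb>
      <ejb-name>MMSG_FranchiseLocationConsumer</ejb-name>
      <jndi-name>jms/javaee6/FranchiseLocationQueue</jndi-name>
      <bean-pool>
        <steady-pool-size>1</steady-pool-size>
        <resize-quantity>1</resize-quantity>
        <max-pool-size>1</max-pool-size>
        <pool-idle-timeout-in-seconds>600</pool-idle-timeout-in-seconds>
      </bean-pool>
    </ejb>
    <property>
      <name>singleton-bean-pool</name>
      <value>true</value>
    </property>
  </enterprise-beans>
</glassfish-ejb-jar>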
  


And there you go.  You can manipulate the pool size and make the MDB singleton or not as you wish. The JMS by default will run on the same JVM in "embedded" mode, which in my case was exactly what I needed.