Monday, August 4, 2014

Matthew Mischke - Week 4: Quality Control


I spent this week working on an important part of the experimental process: data processing and quality control. As I've mentioned before, up until this point most of the work in the office has been data entry. I've spent pretty much every day going through the data collected by Kryger, Pearcy and their colleagues in the binders full of data sheets from 1977-1979, putting that information into the FOIS database. However, the last few beam trawls were added to the database at the end of last week, so now that data has to be processed. That meant this week was spent picking random records to check, using the random number generator in Excel to produce SBMTs, and cross-checking every fish length, depth, tow length, and note recorded in the database against its corresponding data sheet from the binders. Together, Bridget and I processed and reviewed 119 records and made 217 corrections to spelling errors, number typos, and species misplacements or omissions. We also made a combined 40 corrections to notes in the database.
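We did the random picking in Excel, but for anyone curious, the same idea could be sketched in a few lines of Python. This is only an illustration: the function name, the ID range of 1 to 500, the sample size of 20, and the seed are all made up for the example and are not taken from the FOIS database.

```python
import random


def pick_records_to_check(record_ids, sample_size, seed=None):
    """Pick distinct record IDs at random for a manual spot check."""
    ids = list(record_ids)
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    size = min(sample_size, len(ids))  # never ask for more IDs than exist
    return sorted(rng.sample(ids, size))


# Hypothetical ID range and sample size, for illustration only.
to_check = pick_records_to_check(range(1, 501), 20, seed=4)
print(to_check)
```

Sampling without replacement means no record gets pulled twice, and keeping the seed lets a second person reproduce the same list.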

After the random checks were finished, we exported every fish length recorded in the historic data into an Excel sheet and looked at each species individually, sorting the lengths from shortest to longest and from longest to shortest to search for outliers (values that lie far outside the distribution of the rest of the data). We found a few more errors that way and reconciled them in the database.
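We found our outliers by eye after sorting, but a simple rule could flag candidates automatically. Here is a rough sketch that swaps in a different technique, the common interquartile range (IQR) rule, applied per species. The species labels, the lengths, and the cutoff multiplier of 1.5 are invented for the example, and anything flagged would still need to be checked against the original data sheet.

```python
from collections import defaultdict
from statistics import quantiles


def flag_length_outliers(records, k=1.5):
    """Flag (species, length) pairs outside Q1 - k*IQR .. Q3 + k*IQR."""
    by_species = defaultdict(list)
    for species, length in records:
        by_species[species].append(length)

    flagged = []
    for species, lengths in by_species.items():
        if len(lengths) < 4:
            continue  # too few fish to say anything about the distribution
        q1, _, q3 = quantiles(lengths, n=4)
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        flagged.extend((species, x) for x in lengths if x < low or x > high)
    return flagged


# Invented lengths in mm; 1250 stands in for a typo of 125.
sample = [("sole", x) for x in (102, 110, 115, 120, 125, 130, 1250)]
sample += [("sanddab", x) for x in (80, 85, 90, 95, 100)]
print(flag_length_outliers(sample))
```

Grouping by species first matters because a length that is normal for one species can be an obvious error for another.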

In addition to these kinds of quality control, we went back through the database and created a new length type (UN) for unmeasured fish: those marked in the notes as not getting measurements, those in pieces so that a measurement would be inaccurate, and organisms noted only as being caught and dumped overboard. We created records for all of these fish so that accurate counts and database-wide trends could be seen more easily. We also did some cleanup by mapping out the historic cruises and looking for outliers on the map as indications of incorrect coordinates in the database.
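We spotted the bad coordinates visually on a map. A cruder version of the same check, sketched here only as an illustration, is to flag any position that falls outside a box where the cruises are expected to be. The station IDs, positions, and bounding box below are placeholders invented for the example, not values from the historic cruises.

```python
def flag_suspect_coordinates(stations, lat_bounds, lon_bounds):
    """Return IDs of stations whose position falls outside the expected box."""
    lat_min, lat_max = lat_bounds
    lon_min, lon_max = lon_bounds
    suspects = []
    for station_id, lat, lon in stations:
        inside = lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
        if not inside:
            suspects.append(station_id)
    return suspects


# Invented positions; T3 has a longitude entered without its minus sign.
stations = [("T1", 44.6, -124.3), ("T2", 44.7, -124.5), ("T3", 44.6, 124.3)]
suspects = flag_suspect_coordinates(stations, (44.0, 46.0), (-126.0, -123.0))
print(suspects)
```

A box check only catches gross errors like a dropped sign or swapped digits, so plotting the points is still worth doing for anything subtler.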

Overall, it wasn't an exciting week, but the work needed to be done and it has moved us a lot closer to starting some actual analysis, and therefore closer to having some results to report for many years of hard work. It's really encouraging to see things moving forward. I'll be gone next week on a spiritual retreat, but I'll be back to post in week 6.

Thanks for reading,
