Generally speaking, the state of data curation is a mess. If you want to make a real and positive change for your employer or client, volunteer to be the curator of a critical area of data.
Data Curation, In a Nutshell
Although much has been written about proper care and feeding of critical data, very few organizations allocate resources specifically for data curation. The process of cleansing, pruning, and standardizing master data is often an afterthought, if it is included in the project planning at all. Without some measure of data curation, the cornerstones of data-driven organizations begin to deteriorate. Although the absence of a master data curation strategy is rarely fatal, bad data undoubtedly costs time and money, and leads to distrust of the data.
On the other hand, properly caring for critical data improves both quality of and trust in said data. When commonly-used reference data is consistent, atomic, and predictable, data consumers can spend more time focusing on their core functions rather than trying to reconcile questionable or inconsistent information.
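To make the cleansing, pruning, and standardizing steps a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical list of reference records with `code` and `description` fields. The field names, the sample values, and the rule that the first occurrence of a code wins are all illustrative assumptions, not a prescribed method.

```python
def standardize(record):
    """Return a copy of the record with trimmed, consistently cased fields."""
    return {
        "code": record["code"].strip().upper(),
        # Collapse runs of whitespace, then apply title case.
        "description": " ".join(record["description"].split()).title(),
    }


def curate(records):
    """Standardize records, prune blanks, and keep the first record per code."""
    seen = {}
    for record in records:
        clean = standardize(record)
        if not clean["code"]:
            continue  # prune entries that have no usable key
        seen.setdefault(clean["code"], clean)  # assumed rule: first occurrence wins
    return list(seen.values())


# Hypothetical sample data, for illustration only.
raw = [
    {"code": " xr100 ", "description": "chest  x-ray"},
    {"code": "XR100", "description": "Chest X-Ray"},
    {"code": "", "description": "unknown item"},
    {"code": "lab200", "description": "blood panel"},
]

curated = curate(raw)
for row in curated:
    print(row["code"], "-", row["description"])
```

The code itself is trivial. The harder questions, such as which duplicate should survive and what counts as the same item, are decisions for someone with knowledge of the domain.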
Those of us working with data can and should be data curators. No, we probably won’t have that title on our business cards, but the fact is that each of us has domain-specific knowledge of at least one area that we could use to improve the quality and completeness of the data in that domain. For example, when I worked in healthcare, I took on the task of helping to normalize the charge master (the list of items and procedures that would be billed to patients). Even though I had never worked as a caregiver or hospital billing professional, I had learned enough in my four years of managing healthcare data to contribute to the data curation of this critical list. From a technical perspective, the work was small, but the positive impact on the organization was significant.
Data curation doesn’t have to be a destructive process. The word curation itself implies slow, methodical work. In fact, much of what you’ll do as a data curator is rooted in process and education rather than technology. A thorough data curation process will often reveal technical or workflow changes that need to be made, each of which can then be evaluated by weighing its business value against the cost of implementing it. Even if no major technical changes are made, the deficiencies and risks will have been identified, which is valuable in itself.
Be prepared for the fact that being a data curator is usually a thankless job. Sure, there will be short-term acknowledgements when positive changes are made (“Hey, those duplicates have been resolved – thanks!!”), but these disappear over time. Well-curated data is a bit like a kitchen faucet: it just works properly every time, so nobody gives it a second thought. The bottom line is to build a data curation process for business value, not professional acknowledgement.
We Are All Data Curators
Even though the official title of data curator is very rare, each of us working in this field can contribute to the data curation process. Though it’s often a thankless task, its value to data quality and business processes in general is significant.