Data quality and data governance are important topics, and both can have a huge impact on how data can be used in analysis. It certainly makes things clearer when we have consistent data entry, code sets, workflows, and so on. It occurred to me today, though, that the big data movement gives us at least one technique for dealing with suboptimal data quality. (This is especially great for those of us who work in healthcare and like to complain about how contaminated our data environments are.)
At StampedeCon this past week, Kilian Weinberger described machine learning (a key technique in big data analysis) this way:
- Traditional computer science takes input data and program instructions to generate output data.
- Machine learning takes input data and output data to generate inferred program instructions.
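To make the contrast concrete, here is a minimal sketch in Python. The temperature-conversion task and the function names are my own illustrative assumptions, not something from the talk, and ordinary least squares stands in for "machine learning" in the simplest possible form. The first function is a rule written by hand; the second recovers an equivalent rule from nothing but example inputs and outputs.

```python
from statistics import mean


def traditional_program(celsius):
    """Traditional computing: a human writes the rule explicitly."""
    return celsius * 9 / 5 + 32


def infer_linear_rule(inputs, outputs):
    """Learning: estimate a linear rule from examples (ordinary least squares).

    Returns (slope, intercept) such that output is roughly slope * input + intercept.
    """
    x_bar, y_bar = mean(inputs), mean(outputs)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(inputs, outputs))
    variance = sum((x - x_bar) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical example data: inputs paired with the outputs we observed.
inputs = [-40.0, 0.0, 37.0, 100.0]
outputs = [traditional_program(c) for c in inputs]

# The "program instructions" are inferred rather than written.
slope, intercept = infer_linear_rule(inputs, outputs)
print(f"inferred rule: output = {slope:.2f} * input + {intercept:.2f}")
```

The learner never sees the conversion formula; it only sees pairs of numbers. Real machine learning models are far more flexible than a straight line, but the direction of inference is the same.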
This approach raises the question: who are we to judge the data? If we have no program instructions that the data is expected to comply with, or through which it is expected to produce particular results, then who are we to judge the so-called data quality? Let the machine be the judge of the data and infer from it what there is to infer, regardless of what our preconceived notions of quality might be.
Even the quality of the predictive models that come out of machine learning isn't really a judgement of data quality. In most cases, the input and output used to train machine learning algorithms are the input and output of some other human process or more complex workflow. If a good machine learning algorithm can't create a highly predictive model from that input and output, it's probably an indication that the existing process is somewhat indeterminate. That represents a measure of process quality, not data quality.
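One rough way to see what "indeterminate" means here: if the same input has led to different outputs in the recorded history, no model that depends only on that input can reproduce every outcome. The sketch below is an illustration under my own assumptions (the function name and the toy records are made up, not real data). It computes the best accuracy any model could reach on the recorded examples by always predicting the most common outcome for each distinct input.

```python
from collections import Counter, defaultdict


def determinism_ceiling(records):
    """Upper bound on accuracy over these (input, outcome) records.

    When identical inputs were followed by different outcomes, the best any
    function of the input can do is predict the most frequent outcome.
    """
    outcomes_by_input = defaultdict(Counter)
    for features, outcome in records:
        outcomes_by_input[features][outcome] += 1
    best_hits = sum(max(counts.values()) for counts in outcomes_by_input.values())
    return best_hits / len(records)


# Made-up toy records: (input features, outcome of the human process).
records = [
    (("chest pain", "over 60"), "admit"),
    (("chest pain", "over 60"), "admit"),
    (("chest pain", "over 60"), "discharge"),
    (("cough", "under 40"), "discharge"),
]

ceiling = determinism_ceiling(records)
print(f"best achievable accuracy on these records: {ceiling:.2f}")
```

A ceiling well below 1.0 says the process itself made different decisions from the same recorded facts, or that it relied on information that was never captured. Either way, the number describes the process, and it is only an optimistic bound on the recorded examples, not an estimate of how a model would perform on new cases.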
I may be overreaching a bit in my desire to throw data quality arguments out the window. It's just that I've heard data quality used as an excuse too many times in my career, when many of those cases were just a matter of not trying hard enough to understand what was really going on with the data.