Date: 2024-12-21 Page is: DBtxt003.php txt00008595
Ideas
Burgess COMMENTARY
22 tips for better data science

These tips are provided by Dr Granville, who brings 20 years of varied data-intensive experience working with successful start-ups, small companies across various industries, and eBay, Visa, Microsoft, GE and Wells Fargo.
- Leverage external data sources: tweets about your company or your competitors, or data from your vendors (for instance, customizable newsletter eBlast statistics available via vendor dashboards, or by submitting a ticket).
- Nuclear physicists, mechanical engineers, and bioinformatics experts can make great data scientists.
- State your problem correctly, and use sound metrics to measure the yield (over baseline) provided by data science initiatives.
- Use the right KPIs (key metrics) and the right data from the beginning, in any project. Changes due to bad foundations are very costly. This requires careful analysis of your data to create useful databases.
- Fast delivery is better than extreme accuracy. All data sets are dirty anyway. Find the right compromise between perfection and fast return.
- With big data, strong signals (extremes) will usually be noise. Here's a solution.
- Big data has less value than useful data.
- Use big data from third-party vendors for competitive intelligence.
- You can build cheap, great, scalable, robust tools pretty fast, without using old-fashioned statistical science. Think about model-free techniques.
- Big data is easier and less costly than you think. Get the right tools! Here's how to get started.
- Correlation is not causation. This article might help you with this issue.
- You don't have to store all your data permanently. Use smart compression techniques, and keep only statistical summaries for old data.
- Don't forget to adjust your metrics when your data changes, to keep consistency for trending purposes.
- A lot can be done without databases, especially for big data.
- Always include EDA and DOE (exploratory data analysis / design of experiments) early on in any data science project.
- Always create a data dictionary.
- And follow the traditional life cycle of any data science project.
- Data can be used for many purposes:
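
The tip about keeping only statistical summaries for old data can be illustrated with a small sketch. This is one possible approach under stated assumptions, not Dr Granville's own method: it uses Welford's online update to track count, mean, variance, minimum and maximum without retaining raw values, and a pairwise merge so that two archived periods can be combined later. The names RunningSummary and summarize are hypothetical, introduced only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningSummary:
    """Fixed-size summary of one numeric column (hypothetical helper)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0            # sum of squared deviations from the mean
    lo: float = math.inf
    hi: float = -math.inf

    def add(self, x: float) -> None:
        # Welford's online update: single pass, no raw values kept.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    def merge(self, other: "RunningSummary") -> "RunningSummary":
        # Combine two summaries (for example two archived months)
        # without access to the underlying records.
        if self.n == 0:
            return other
        if other.n == 0:
            return self
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningSummary(n, mean, m2,
                              min(self.lo, other.lo), max(self.hi, other.hi))

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize(values) -> RunningSummary:
    """Build a summary from any iterable of numbers."""
    s = RunningSummary()
    for v in values:
        s.add(v)
    return s
```

In use, each old batch would be reduced with summarize(...) before the raw rows are dropped, and summaries from different periods combined with merge(...) when a longer trend is needed. The trade-off is that anything not captured in the summary (such as quantiles or individual outliers) is lost once the raw data is gone, so the fields kept should match the metrics you expect to report.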