Executing Data Quality Projects

Executing Data Quality Projects
Author: Danette McGilvray
Publisher: Academic Press
Total Pages: 376
Release: 2021-05-27
ISBN 10: 0128180161
ISBN 13: 9780128180167
Language: EN, FR, DE, ES & NL

Executing Data Quality Projects Book Review:

Executing Data Quality Projects, Second Edition presents a structured yet flexible approach for creating, improving, sustaining and managing the quality of data and information within any organization. Studies show that data quality problems are costing businesses billions of dollars each year, with poor data linked to waste and inefficiency, damaged credibility among customers and suppliers, and an organizational inability to make sound decisions. Help is here! This book describes a proven Ten Steps approach that combines a conceptual framework for understanding information quality with techniques, tools, and instructions for practically putting the approach to work – with the end result of high-quality trusted data and information, so critical to today’s data-dependent organizations. The Ten Steps approach applies to all types of data and all types of organizations – for-profit in any industry, non-profit, government, education, healthcare, science, research, and medicine. This book includes numerous templates, detailed examples, and practical advice for executing every step. At the same time, readers are advised on how to select relevant steps and apply them in different ways to best address the many situations they will face. The layout allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, best practices, and warnings. The experiences of actual clients and users of the Ten Steps provide real examples of outputs for the steps, plus highlighted sidebar case studies called Ten Steps in Action.
This book uses projects as the vehicle for data quality work and uses the word broadly to include: 1) focused data quality improvement projects, such as improving data used in supply chain management, 2) data quality activities in other projects such as building new applications and migrating data from legacy systems, integrating data because of mergers and acquisitions, or untangling data due to organizational breakups, and 3) ad hoc use of data quality steps, techniques, or activities in the course of daily work. The Ten Steps approach can also be used to enrich an organization’s standard SDLC (whether sequential or Agile), and it complements general improvement methodologies such as Six Sigma or Lean. No two data quality projects are the same, but the flexible nature of the Ten Steps means the methodology can be applied to all. The new Second Edition highlights topics such as artificial intelligence and machine learning, Internet of Things, security and privacy, analytics, legal and regulatory requirements, data science, big data, data lakes, and cloud computing, among others, to show their dependence on data and information and why data quality is more relevant and critical now than ever before. Includes concrete instructions, numerous templates, and practical advice for executing every step of the Ten Steps approach. Contains real examples from around the world, gleaned from the author’s consulting practice and from those who implemented based on her training courses and the earlier edition of the book. Allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, and best practices. A companion Web site includes links to numerous data quality resources, including many of the templates featured in the text, quick summaries of key ideas from the Ten Steps methodology, and other tools and information that are available online.

Executing Data Quality Projects

Executing Data Quality Projects
Author: Danette McGilvray
Publisher: Elsevier
Total Pages: 352
Release: 2008-09-01
ISBN 10: 0080558399
ISBN 13: 9780080558394
Language: EN, FR, DE, ES & NL

Executing Data Quality Projects Book Review:

Information is currency. Recent studies show that data quality problems are costing businesses billions of dollars each year, with poor data linked to waste and inefficiency, damaged credibility among customers and suppliers, and an organizational inability to make sound decisions. In this important and timely new book, Danette McGilvray presents her “Ten Steps” approach to information quality, a proven method for both understanding and creating information quality in the enterprise. Her trademarked approach, in which she has trained Fortune 500 clients and hundreds of workshop attendees, applies to all types of data and to all types of organizations. * Includes numerous templates, detailed examples, and practical advice for executing every step of the “Ten Steps” approach. * Allows for quick reference with an easy-to-use format highlighting key concepts and definitions, important checkpoints, communication activities, and best practices. * A companion Web site includes links to numerous data quality resources, including many of the planning and information-gathering templates featured in the text, quick summaries of key ideas from the Ten Steps methodology, and other tools and information available online.

Executing Data Quality Projects

Executing Data Quality Projects
Author: Danette McGilvray
Publisher: Morgan Kaufmann
Total Pages: 325
Release: 2008
ISBN 10: 0123743699
ISBN 13: 9780123743695
Language: EN, FR, DE, ES & NL

Executing Data Quality Projects Book Review:

Introduces a systematic, effective approach to enhancing and creating data and information quality that integrates a conceptual framework with essential tools, techniques, and instructions, accompanied by helpful templates, real-world examples, and advice, as well as highlighted definitions, key concepts, checkpoints, warnings, communication activities, and best practices. Original. (Intermediate)

The Practitioner's Guide to Data Quality Improvement

The Practitioner's Guide to Data Quality Improvement
Author: David Loshin
Publisher: Elsevier
Total Pages: 432
Release: 2010-11-22
ISBN 10: 0080920349
ISBN 13: 9780080920344
Language: EN, FR, DE, ES & NL

The Practitioner's Guide to Data Quality Improvement Book Review:

The Practitioner's Guide to Data Quality Improvement offers a comprehensive look at data quality for business and IT, encompassing people, process, and technology. It shares the fundamentals for understanding the impacts of poor data quality, and guides practitioners and managers alike in socializing, gaining sponsorship for, planning, and establishing a data quality program. It demonstrates how to institute and run a data quality program, from first thoughts and justifications to maintenance and ongoing metrics. It includes an in-depth look at the use of data quality tools, including business case templates, and tools for analysis, reporting, and strategic planning. This book is recommended for data management practitioners, including database analysts, information analysts, data administrators, data architects, enterprise architects, data warehouse engineers, and systems analysts, and their managers. Offers a comprehensive look at data quality for business and IT, encompassing people, process, and technology. Shows how to institute and run a data quality program, from first thoughts and justifications to maintenance and ongoing metrics. Includes an in-depth look at the use of data quality tools, including business case templates, and tools for analysis, reporting, and strategic planning.

Measuring Data Quality for Ongoing Improvement

Measuring Data Quality for Ongoing Improvement
Author: Laura Sebastian-Coleman
Publisher: Newnes
Total Pages: 376
Release: 2012-12-31
ISBN 10: 0123977541
ISBN 13: 9780123977540
Language: EN, FR, DE, ES & NL

Measuring Data Quality for Ongoing Improvement Book Review:

The Data Quality Assessment Framework shows you how to measure and monitor data quality, ensuring quality over time. You’ll start with general concepts of measurement and work your way through a detailed framework of more than three dozen measurement types related to five objective dimensions of quality: completeness, timeliness, consistency, validity, and integrity. Ongoing measurement, rather than one-time activities, will help your organization reach a new level of data quality. This plain-language approach to measuring data can be understood by both business and IT and provides practical guidance on how to apply the DQAF within any organization, enabling you to prioritize measurements and effectively report on results. Strategies for using data measurement to govern and improve the quality of data and guidelines for applying the framework within a data asset are included. You’ll come away able to prioritize which measurement types to implement, knowing where to place them in a data flow and how frequently to measure. Common conceptual models for defining and storing data quality results for purposes of trend analysis are also included, as well as generic business requirements for ongoing measuring and monitoring, including calculations and comparisons that make the measurements meaningful and help understand trends and detect anomalies. Demonstrates how to leverage a technology-independent data quality measurement framework for your specific business priorities and data quality challenges. Enables discussions between business and IT with a non-technical vocabulary for data quality measurement. Describes how to measure data quality on an ongoing basis with generic measurement types that can be applied to any situation.

Data Quality Assessment

Data Quality Assessment
Author: Arkady Maydanchik
Publisher: Technics Publications
Total Pages: 336
Release: 2007-04-01
ISBN 10: 163462047X
ISBN 13: 9781634620475
Language: EN, FR, DE, ES & NL

Data Quality Assessment Book Review:

Imagine a group of prehistoric hunters armed with stone-tipped spears. Their primitive weapons made hunting large animals, such as mammoths, dangerous work. Over time, however, a new breed of hunters developed. They would stretch the skin of a previously killed mammoth on the wall and throw their spears, while observing which spear, thrown from which angle and distance, penetrated the skin the best. The data gathered helped them make better spears and develop better hunting strategies. Quality data is the key to any advancement, whether it’s from the Stone Age to the Bronze Age. Or from the Information Age to whatever Age comes next. The success of corporations and government institutions largely depends on the efficiency with which they can collect, organize, and utilize data about products, customers, competitors, and employees. Fortunately, improving your data quality doesn’t have to be such a mammoth task. DATA QUALITY ASSESSMENT is a must read for anyone who needs to understand, correct, or prevent data quality issues in their organization. Skipping theory and focusing purely on what is practical and what works, this text contains a proven approach to identifying, warehousing, and analyzing data errors – the first step in any data quality program. Master techniques in: • Data profiling and gathering metadata • Identifying, designing, and implementing data quality rules • Organizing rule and error catalogues • Ensuring accuracy and completeness of the data quality assessment • Constructing the dimensional data quality scorecard • Executing a recurrent data quality assessment This is one of those books that marks a milestone in the evolution of a discipline. Arkady's insights and techniques fuel the transition of data quality management from art to science -- from crafting to engineering. From deep experience, with thoughtful structure, and with engaging style Arkady brings the discipline of data quality to practitioners. 
David Wells, Director of Education, Data Warehousing Institute

Data Stewardship

Data Stewardship
Author: David Plotkin
Publisher: Newnes
Total Pages: 248
Release: 2013-09-16
ISBN 10: 0124104452
ISBN 13: 9780124104457
Language: EN, FR, DE, ES & NL

Data Stewardship Book Review:

Data stewards in business and IT are the backbone of a successful data governance implementation because they do the work to make a company’s data trusted, dependable, and high quality. Data Stewardship explains everything you need to know to successfully implement the stewardship portion of data governance, including how to organize, train, and work with data stewards, get high-quality business definitions and other metadata, and perform the day-to-day tasks using a minimum of the steward’s time and effort. David Plotkin has loaded this book with practical advice on stewardship so you can get right to work, have early successes, and measure and communicate those successes, gaining more support for this critical effort. Provides clear and concise practical advice on implementing and running data stewardship, including guidelines on how to organize based on company structure, business functions, and data ownership. Shows how to gain support for your stewardship effort, maintain that support over the long term, and measure the success of the data stewardship effort and report back to management. Includes detailed lists of responsibilities for each type of data steward and strategies to help the Data Governance Program Office work effectively with the data stewards.

The Decision Model

The Decision Model
Author: Barbara von Halle,Larry Goldberg
Publisher: CRC Press
Total Pages: 553
Release: 2009-10-27
ISBN 10: 1420082825
ISBN 13: 9781420082821
Language: EN, FR, DE, ES & NL

The Decision Model Book Review:

In the current fast-paced and constantly changing business environment, it is more important than ever for organizations to be agile, monitor business performance, and meet increasingly stringent compliance requirements. Written by pioneering consultants and bestselling authors with track records of international success, The Decision Model: A Business Logic Framework Linking Business and Technology provides a platform for rethinking how to view, design, execute, and govern business logic. The book explains how to implement the Decision Model, a stable, rigorous model of core business logic that informs current and emerging technology. The authors supply a strong theoretical foundation, while succinctly defining the path needed to incorporate agile and iterative techniques for developing a model that will be the cornerstone for continual growth. Because the book introduces a new model with tentacles in many disciplines, it is divided into three sections. Section 1: A complete overview of the Decision Model and its place in the business and technology world. Section 2: A detailed treatment of the foundation of the Decision Model and a formal definition of the Model. Section 3: Specialized topics of interest on the Decision Model, including both business and technical issues. The Decision Model provides a framework for organizing business rules into well-formed decision-based structures that are predictable, stable, maintainable, and normalized. More than this, the Decision Model directly correlates business logic to the business drivers behind it, allowing it to be used as a lever for meeting changing business objectives and marketplace demands. This book not only defines the Decision Model but also demonstrates how it can be used to organize decision structures for maximum stability, agility, and technology independence and provide input into automation design.

A Guide to the Project Management Body of Knowledge PMBOK Guide Seventh Edition and The Standard for Project Management RUSSIAN

A Guide to the Project Management Body of Knowledge PMBOK Guide Seventh Edition and The Standard for Project Management RUSSIAN
Author: Project Management Institute
Publisher: Project Management Institute
Total Pages: 368
Release: 2021-08-01
ISBN 10: 1628257008
ISBN 13: 9781628257007
Language: EN, FR, DE, ES & NL

A Guide to the Project Management Body of Knowledge PMBOK Guide Seventh Edition and The Standard for Project Management RUSSIAN Book Review:

PMBOK® Guide is the go-to resource for project management practitioners. The project management profession has significantly evolved due to emerging technology, new approaches and rapid market changes. Reflecting this evolution, The Standard for Project Management enumerates 12 principles of project management and the PMBOK® Guide – Seventh Edition is structured around eight project performance domains. This edition is designed to address practitioners' current and future needs and to help them be more proactive, innovative and nimble in enabling desired project outcomes. This edition of the PMBOK® Guide: • Reflects the full range of development approaches (predictive, adaptive, hybrid, etc.); • Provides an entire section devoted to tailoring the development approach and processes; • Includes an expanded list of models, methods, and artifacts; • Focuses on not just delivering project outputs but also enabling outcomes; and • Integrates with PMIstandards+™ for information and standards application content based on project type, development approach, and industry sector.

Journey to Data Quality

Journey to Data Quality
Author: Yang W. Lee,Leo L. Pipino,Richard Y. Wang,James D. Funk
Publisher: MIT Press
Total Pages: 226
Release: 2009
ISBN 10: 0262513358
ISBN 13: 9780262513357
Language: EN, FR, DE, ES & NL

Journey to Data Quality Book Review:

A guide for assessing an organization's data quality practice and a roadmap for implementing a viable data and information quality management program, based on rigorous research and drawing on real-world examples. All organizations today confront data quality problems, both systemic and structural. Neither ad hoc approaches nor fixes at the systems level--installing the latest software or developing an expensive data warehouse--solve the basic problem of bad data quality practices. Journey to Data Quality offers a roadmap that can be used by practitioners, executives, and students for planning and implementing a viable data and information quality management program. This practical guide, based on rigorous research and informed by real-world examples, describes the challenges of data management and provides the principles, strategies, tools, and techniques necessary to meet them. The authors, all leaders in the data quality field for many years, discuss how to make the economic case for data quality and the importance of getting an organization's leaders on board. They outline different approaches for assessing data, both subjectively (by users) and objectively (using sampling and other techniques). They describe real problems and solutions, including efforts to find the root causes of data quality problems at a healthcare organization and data quality initiatives taken by a large teaching hospital. They address setting company policy on data quality and, finally, they consider future challenges on the journey to data quality.

Ten Steps to a Results Based Monitoring and Evaluation System

Ten Steps to a Results Based Monitoring and Evaluation System
Author: Jody Zall Kusek,Ray C. Rist
Publisher: World Bank Publications
Total Pages: 264
Release: 2004-06-15
ISBN 10: 0821389076
ISBN 13: 9780821389072
Language: EN, FR, DE, ES & NL

Ten Steps to a Results Based Monitoring and Evaluation System Book Review:

This Handbook provides a comprehensive ten-step model that will help guide development practitioners through the process of designing and building a results-based monitoring and evaluation system.

Data Quality

Data Quality
Author: Rupa Mahanti
Publisher: Quality Press
Total Pages: 526
Release: 2019-03-18
ISBN 10: 0873899776
ISBN 13: 9780873899772
Language: EN, FR, DE, ES & NL

Data Quality Book Review:

“This is not the kind of book that you’ll read one time and be done with. So scan it quickly the first time through to get an idea of its breadth. Then dig in on one topic of special importance to your work. Finally, use it as a reference to guide your next steps, learn details, and broaden your perspective.” (From the foreword by Thomas C. Redman, Ph.D., “the Data Doc”.) Good data is a source of myriad opportunities, while bad data is a tremendous burden. Companies that manage their data effectively are able to achieve a competitive advantage in the marketplace, while bad data, like cancer, can weaken and kill an organization. In this comprehensive book, Rupa Mahanti provides guidance on the different aspects of data quality with the aim of improving data quality. Specifically, the book addresses: causes of bad data quality, bad data quality impacts, and the importance of data quality to justify the case for data quality; the butterfly effect of data quality; a detailed description of data quality dimensions and their measurement; the data quality strategy approach; the Six Sigma DMAIC approach to data quality; data quality management techniques; data quality in relation to data initiatives like data migration, MDM, data governance, etc.; and data quality myths, challenges, and critical success factors. Students, academicians, professionals, and researchers can all use the content in this book to further their knowledge and get guidance on their own specific projects. It balances technical details (for example, SQL statements, relational database components, data quality dimensions measurements) and higher-level qualitative discussions (cost of data quality, data quality strategy, data quality maturity, the case made for data quality, and so on) with case studies, illustrations, and real-world examples throughout.

Conducting Online Surveys

Conducting Online Surveys
Author: Valerie M. Sue,Lois A. Ritter
Publisher: SAGE
Total Pages: 242
Release: 2012
ISBN 10: 1412992257
ISBN 13: 9781412992251
Language: EN, FR, DE, ES & NL

Conducting Online Surveys Book Review:

This book addresses the needs of researchers who want to conduct surveys online. Some of the issues discussed, such as sampling from online populations, developing online and mobile questionnaires, and administering electronic surveys, are unique to digital surveys. Others, like creating reliable and valid survey questions, data analysis strategies, and writing the survey report, are common to all survey environments. This single resource captures the particulars of conducting digital surveys from start to finish.

Computational Genomics with R

Computational Genomics with R
Author: Altuna Akalin
Publisher: CRC Press
Total Pages: 440
Release: 2020-12-16
ISBN 10: 1498781861
ISBN 13: 9781498781862
Language: EN, FR, DE, ES & NL

Computational Genomics with R Book Review:

Computational Genomics with R provides a starting point for beginners in genomic data analysis and also guides more advanced practitioners to sophisticated data analysis techniques in genomics. The book covers topics from R programming, to machine learning and statistics, to the latest genomic data analysis techniques. The text provides accessible information and explanations, always with the genomics context in the background. This also contains practical and well-documented examples in R so readers can analyze their data by simply reusing the code presented. As the field of computational genomics is interdisciplinary, it requires different starting points for people with different backgrounds. For example, a biologist might skip sections on basic genome biology and start with R programming, whereas a computer scientist might want to start with genome biology. After reading: You will have the basics of R and be able to dive right into specialized uses of R for computational genomics such as using Bioconductor packages. You will be familiar with statistics, supervised and unsupervised learning techniques that are important in data modeling, and exploratory analysis of high-dimensional data. You will understand genomic intervals and operations on them that are used for tasks such as aligned read counting and genomic feature annotation. You will know the basics of processing and quality checking high-throughput sequencing data. You will be able to do sequence analysis, such as calculating GC content for parts of a genome or finding transcription factor binding sites. You will know about visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization. You will be familiar with analysis of different high-throughput sequencing data sets, such as RNA-seq, ChIP-seq, and BS-seq. You will know basic techniques for integrating and interpreting multi-omics datasets. 
Altuna Akalin is a group leader and head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center, Berlin. He has been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. He has published an extensive body of work in this area. The framework for this book grew out of the yearly computational genomics courses he has been organizing and teaching since 2015.

Predictive Analytics For Dummies

Predictive Analytics For Dummies
Author: Dr. Anasse Bari,Mohamed Chaouchi,Tommy Jung
Publisher: John Wiley & Sons
Total Pages: 360
Release: 2014-03-06
ISBN 10: 1118729412
ISBN 13: 9781118729410
Language: EN, FR, DE, ES & NL

Predictive Analytics For Dummies Book Review:

Combine business sense, statistics, and computers in a new and intuitive way, thanks to Big Data. Predictive analytics is a branch of data mining that helps predict probabilities and trends. Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions. Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more. Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses. Helps readers see how to shepherd predictive analytics projects through their companies. Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more. Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data. Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere. Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.

Spark in Action

Spark in Action
Author: Jean-Georges Perrin
Publisher: Simon and Schuster
Total Pages: 576
Release: 2020-05-12
ISBN 10: 1638351309
ISBN 13: 9781638351306
Language: EN, FR, DE, ES & NL

Spark in Action Book Review:

Summary: The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology: Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem. About the book: Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms. What's inside: Writing Spark applications in Java; Spark application architecture; Ingestion through files, databases, streaming, and Elasticsearch; Querying distributed datasets with Spark SQL. About the reader: This book does not assume previous experience with Spark, Scala, or Hadoop. About the author: Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.
Table of Contents PART 1 - THE THEORY CRIPPLED BY AWESOME EXAMPLES 1 So, what is Spark, anyway? 2 Architecture and flow 3 The majestic role of the dataframe 4 Fundamentally lazy 5 Building a simple app for deployment 6 Deploying your simple app PART 2 - INGESTION 7 Ingestion from files 8 Ingestion from databases 9 Advanced ingestion: finding data sources and building your own 10 Ingestion through structured streaming PART 3 - TRANSFORMING YOUR DATA 11 Working with SQL 12 Transforming your data 13 Transforming entire documents 14 Extending transformations with user-defined functions 15 Aggregating your data PART 4 - GOING FURTHER 16 Cache and checkpoint: Enhancing Spark’s performances 17 Exporting data and building full data pipelines 18 Exploring deployment

Data Science on AWS

Data Science on AWS
Author: Chris Fregly,Antje Barth
Publisher: O'Reilly Media, Inc.
Total Pages: 524
Release: 2021-04-07
ISBN 10: 1492079367
ISBN 13: 9781492079361
Language: EN, FR, DE, ES & NL

Data Science on AWS Book Review:

With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level up your skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance. Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more. Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot. Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment. Tie everything together into a repeatable machine learning operations pipeline. Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka. Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more.

Big Data and Social Science

Big Data and Social Science
Author: Ian Foster,Rayid Ghani,Ron S. Jarmin,Frauke Kreuter,Julia Lane
Publisher: CRC Press
Total Pages: 391
Release: 2020-11-18
ISBN 10: 100020863X
ISBN 13: 9781000208634
Language: EN, FR, DE, ES & NL

Big Data and Social Science Book Review:

Big Data and Social Science: Data Science Methods and Tools for Research and Practice, Second Edition shows how to apply data science to real-world problems, covering all stages of a data-intensive social science or policy project. Prominent leaders in the social sciences, statistics, and computer science as well as the field of data science provide a unique perspective on how to apply modern social science research principles and current analytical and computational tools. The text teaches you how to identify and collect appropriate data, apply data science methods and tools to the data, and recognize and respond to data errors, biases, and limitations.

Features
Takes an accessible, hands-on approach to handling new types of data in the social sciences
Presents the key data science tools in a non-intimidating way to both social and data scientists while keeping the focus on research questions and purposes
Illustrates social science and data science principles through real-world problems
Links computer science concepts to practical social science research
Promotes good scientific practice
Provides freely available data and code as well as practical programming exercises through Binder and GitHub

New to the Second Edition
Increased use of examples from different areas of social sciences
New chapter on dealing with Bias and Fairness in Machine Learning models
Expanded chapters focusing on Machine Learning and Text Analysis
Revamped hands-on Jupyter notebooks to reinforce concepts covered in each chapter

This classroom-tested book fills a major gap in graduate- and professional-level data science and social science education. It can be used to train a new generation of social data scientists to tackle real-world problems and improve the skills and competencies of applied social scientists and public policy practitioners. It empowers you to use the massive and rapidly growing amounts of available data to interpret economic and social activities in a scientific and rigorous manner.

Statistics in a Nutshell

Statistics in a Nutshell
Author: Sarah Boslaugh
Publisher: O'Reilly Media, Inc.
Total Pages: 569
Release: 2012-11-15
ISBN 10: 1449316824
ISBN 13: 9781449316822
Language: EN, FR, DE, ES & NL

Statistics in a Nutshell Book Review:

A clear and concise introduction and reference for anyone new to the subject of statistics.

Data Governance

Data Governance
Author: John Ladley
Publisher: Academic Press
Total Pages: 350
Release: 2019-11-08
ISBN 10: 0128158328
ISBN 13: 9780128158326
Language: EN, FR, DE, ES & NL

Data Governance Book Review:

Managing data continues to grow as a necessity for modern organizations. There are seemingly infinite opportunities for organic growth, reduction of costs, and creation of new products and services. It has become apparent that none of these opportunities can happen smoothly without data governance. The cost of exponential data growth and privacy and security concerns are becoming burdensome. Organizations will encounter unexpected consequences in new sources of risk. The solution to these challenges is also data governance, which ensures balance between risk and opportunity.

Data Governance, Second Edition, is for any executive, manager or data professional who needs to understand or implement a data governance program. It is required to ensure consistent, accurate and reliable data across their organization. This book offers an overview of why data governance is needed, how to design, initiate, and execute a program, and how to keep the program sustainable. This valuable resource provides comprehensive guidance to beginning professionals, managers or analysts looking to improve their processes, and advanced students in Data Management and related courses. With the provided framework and case studies, all professionals in the data governance field will gain key insights into launching a successful and money-saving data governance program.

Incorporates industry changes, lessons learned and new approaches
Explores various ways in which data analysts and managers can ensure consistent, accurate and reliable data across their organizations
Includes new case studies which detail real-world situations
Explores all of the capabilities an organization must adopt to become data driven
Provides guidance on various approaches to data governance, to determine whether an organization should be low profile, centrally controlled, agile, or traditional
Provides guidance on using technology and separating vendor hype from sincere delivery of necessary capabilities
Offers readers insights into how their organizations can improve the value of their data, through data quality, data strategy and data literacy
Provides up to 75% brand-new content compared to the first edition