In this module, you will learn about converting data from one format to another, validating data, and adjusting data for your analysis. One of the ways to help ensure that you have an accurate analysis of your data is by putting all of it in the correct format. This is true even if you have already cleaned and processed…
Course 5: Analyse Data to Answer Questions, Module 1: Organise data for more effective analysis
Organising data makes it easier to use in your analysis. In this part of the course, we’ll learn the importance of organising your data through sorting and filtering. We’ll explore these processes in both spreadsheets and SQL as you continue to prepare your data. Learning Objectives Course content Course 5 – Analyse Data to Answer Questions Module 1: Organise…
Course 4: Process Data from Dirty to Clean, Module 4: Verify and Report Results
When we clean data, we make changes to the original dataset. It’s important to verify that the changes we make are accurate and to let our teammates know about them. In this part of the course, we’ll learn to verify that data is clean and report our data cleaning results. With verified clean data, we are ready to begin analysing!…
Course 4: Process Data from Dirty to Clean, Module 3: SQL
Knowing a variety of ways to clean data can make a data analyst’s job much easier. Learning Objectives How a junior data analyst uses SQL In this reading, you will learn more about how to decide when to use SQL, or Structured Query Language. As a data analyst, you will be tasked with handling a lot of data, and SQL…
Course 4: Process Data from Dirty to Clean, Module 2: Clean it up
What is dirty data? Earlier, we discussed that dirty data is data that is incomplete, incorrect, or irrelevant to the problem you are trying to solve. This section summarizes the types of dirty data. Duplicate data. Description: any data record that shows up more than once. Possible causes: manual data entry, batch data imports, or data migration. Potential harm to businesses: …
Course 4: Process Data from Dirty to Clean, Module 1: The importance of integrity
Scenario: calendar dates for a global company. Calendar dates are represented in a lot of different short forms. Depending on where you live, a different format might be used. Now, think about what would happen if you were working as a data analyst for a global company and didn’t check date formats. Well, your data integrity would probably be questionable.…
Course 4: Process Data from Dirty to Clean: Overview
This course is the fourth in the Google Data Analytics Certificate program. It will teach you how to clean data using spreadsheets and SQL, as well as how to verify and report your data cleaning results. This is an important skill for data analysts, as it ensures that the data they are working with is accurate and reliable. Here are…
Course 3: Prepare Data For Exploration, Module 4: Organise and Secure Data
File organisation guidelines: Every data analyst’s goal is to conduct efficient data analysis. One way to increase the efficiency of your analyses is to streamline processes that help save time and energy in the long run. Meaningful, logical, and consistent file names help data analysts organise their data and automate their analysis process. When you use consistent guidelines to describe…
Course 3: Prepare Data For Exploration, Module 3: Database Essentials
Maximise databases in data analytics: Databases enable analysts to manipulate, store, and process data. This helps them search through data a lot more efficiently to get the best insights. Relational databases: A relational database is a database that contains a series of tables that can be connected to form relationships. Basically, they allow data analysts to organise and link data…
Course 3: Prepare Data For Exploration, Module 2: Data responsibility
Data Responsibility Rundown Key Learnings: Specific Topics Covered: Data anonymization What is data anonymization? We have been learning about the importance of privacy in data analytics. Now, it is time to talk about data anonymization and what types of data should be anonymized. Personally identifiable information, or PII, is information that can be used by itself or with other data to…