- Foundation Course Module 1: Introduction of data analytics and analytical thinking
- Foundation Course Module 2 : The wonderful world of data
- Foundation Course Module 3 : Setup of data analytics toolbox
- Foundation Course Module 4: Becoming a fair and impactful data professional
- Foundation Course: Glossary
- Course 2: Ask questions to make data driven decisions, Module 1: Ask effective questions
- Course 2: Ask questions to make data driven decisions, Module 2: Make data-driven decisions
- Course 2: Ask questions to make data driven decisions, Module 3: Spreadsheet magic
- Course 2: Ask questions to make data driven decisions, Module 4: Always remember the stakeholder
- Course 3: Prepare Data For Exploration: Learning objectives and overviews
- Course 3: Prepare Data For Exploration, Module 1: Data types and structures
- Course 3: Prepare Data For Exploration, Module 2: Data responsibility
- Course 3: Prepare Data For Exploration, Module 3: Database Essentials
- Course 3: Prepare Data For Exploration, Module 4: Organise and Secure Data
- Course 4: Process Data from Dirty to Clean: Overview
- Course 4: Process Data from Dirty to Clean, Module 1: The importance of integrity
- Course 4: Process Data from Dirty to Clean, Module 2: Clean it up
- Course 4: Process Data from Dirty to Clean, Module 3: SQL
- Course 4: Process Data from Dirty to Clean, Module 4: Verify and Report Results
- Course 5: Analyse Data to Answer Questions, Module 1: Organise data for more effective analysis
- Course 5: Analyse Data to Answer Questions, Module 2: Format and adjust data
- Course 5: Analyse Data to Answer Questions, Module 3: Aggregate data for analysis
- Course 5: Analyse Data to Answer Questions, Module 4: Perform Data Calculations
- Course 6: Share Data Through the Art of Visualisation, Course Overview plus Module 1: Visualise Data
- Course 6: Share Data Through the Art of Visualisation, Course Overview plus Module 2: Create Data Visualisation with Tableau
Organising data makes the data easier to use in your analysis. In this part of the course, We’ll learn the importance of organising your data through sorting and filtering. We’ll explore these processes in both spreadsheets and SQL as you continue to prepare your data.
Learning Objectives
- Describe what is involved in the data analysis process with reference to goals and key tasks
- Discuss the importance of organizing data before analysis with references to sorts and filters
- Describe sorting as it relates to data in a spreadsheet or database with reference to functionality and benefits
- Recall the steps involved in sorting and filtering data through the use of SQL queries
Course content
Course 5 – Analyse Data to Answer Questions
- Organise data to begin analysis. Organizing data makes the data easier to use in an analysis. In this part of the course, you will learn the importance of organizing your data with sorting and filtering. You will explore organizing data in both spreadsheets and with SQL queries and temporary tables.
- Format and adjust data. As you move closer to analysing your data, you will want to have the data formatted and ready to go. In this part of the course, you will learn all about converting and formatting data, including how to use SQL queries to combine data. You will also discover the value of feedback and support from your colleagues and how it can lead to new insights that you can apply to your work.
- Aggregate data for analysis. During an analysis, you might need to combine data to gain insights and complete business objectives. In this part of the course, you will explore the functions, procedures, and syntax to combine, or aggregate data. You will learn how to combine data within multiple cells in spreadsheets, and within multiple database tables using SQL queries.
- Perform data calculations. Calculations are one of the more common tasks that data analysts perform during an analysis. In this part of the course, you will explore formulas, functions, and pivot tables in spreadsheets and SQL queries. All of these are used in data calculations. You will also learn about the benefits of using SQL to manage temporary database tables.
Module 1: Organise data to begin analysis
Sort and filter data to keep it organised
The first two phases of data analysis, organise data and format and adjust data, are important for data analysts because they can use these phases to manipulate their data in ways that make important patterns and trends more obvious. Most of the datasets you’ll use as a data analyst will be organised as tables. Tables are helpful because they let you manipulate and categorize your data. Having distinct categories and classifications lets you focus on, and differentiate between, the groups in your data quickly and easily.
Sorting and filtering are two methods you can use to organise, format, and adjust data. For example, a filter can help you find errors or outliers so you can fix or flag them before your analysis. Outliers are data points that are very different from similarly collected data and might not be reliable values. The benefit of filtering the data is that after you fix errors or identify outliers, you can remove the filter and return the data to its original organisation.
Sort data
Sorting is the process of arranging data into a meaningful order to make it easier to understand, analyse, and visualise. It ranks your data based on a specific metric you choose. You can sort data in spreadsheets, SQL databases (when your dataset is too large for spreadsheets), and tables in documents.
To rank items or create chronological lists, you can sort by ascending or descending order. Sorting arranges the data in a meaningful way and gives you immediate insights. Sorting also helps you to group similar data together by a classification. For example, if a ski resort design company wants to evaluate the resorts designed by a competitor, a data analyst can sort competitive resorts by locations, runs, acreage, and other factors. This way, the firm’s designers can visit the types of resorts they also design and gather information that could be used in its own future designs.
Filter data
Sometimes, an analysis may require only a subset of the data in your dataset. You can use a filter to show only the data that meets a specified criterion while hiding the rest. Filtering is useful when you have lots of data. You can save time by zeroing in on the data that’s important for your current analysis or the data that contains errors. Most spreadsheets and SQL databases allow you to filter your data in a variety of ways. Filtering gives you the ability to find what you are looking for without too much effort.
For example, if the ski resort design company wants to inspect specific criteria for the competitive ski resorts they intend to visit and evaluate, a data analyst can filter the competitive resort database to extract information about the number of runs compared to acreage to identify design trends or other insights.
Sort a pivot table
A pivot table is a data summarisation tool used to sort, reorganise, group, count, total, or average data. Items in the row and column areas of a pivot table are sorted in ascending order by any custom list first. If the items aren’t in a custom list, they will be sorted in ascending order by default. But, if you sort in descending order, you are setting up a rule that controls how the field is sorted even after new data points are added. For example, in the ski resort dataset, the pivot table allows locations to be sorted alphabetically by state, territory, or country.
Data analysts filter and sort data to organise it for better understanding, analysis, and visualization. Sorting arranges data in a meaningful order, while filtering displays only data that meets specific criteria. Combining filtering and sorting allows for organizing only relevant data for analysis. Both spreadsheets and SQL databases allow for data filtering and sorting data.
Sort and filter in Sheets
Sorting in Google Sheets helps you quickly spot trends in numbers and text. For example, as the vice president of sales at a candy company, you want to improve chocolate sales in lower-performing regions—your company makes delicious chocolate and you know sales can improve. As a first step, you examine this by calculating gross (total) revenue of chocolate by sales region. In this case, you could sort the gross revenue column in descending (Z to A) order to find the top performing regions at the top, or sort the gross revenue column in ascending (A to Z) order to find the lowest performing regions at the top. Then, you can look at patterns in the best and worst regions to explore how to increase sales in the lower-performing regions.
If you want to learn more about the set menu options for sorting and filtering, start with these resources:
- Sort and filter data (Google Help Center): instructions to sort data in alphabetical or numerical order and create filter views
- Sort data by selecting a range of data in a column: video of steps to achieve the task
- Sort a range of data using sort criteria for multiple columns: technical tip instructions by SheetGo company to sort data across multiple columns
In addition to the standard menu options, the SORT function allows you to do more advanced sorting. Use this function to create custom sorting rules. You can sort the rows of a given range of data by the values in one or more columns. And you can set the sort criteria for each column. Refer to the SORT function page for the syntax.
Like the SORT function, use the FILTER function to filter by any matching criteria you like. This creates a custom filter.
As you’ve learned, you can filter data and then sort the filtered results. Using the FILTER and SORT functions together in a range of cells can programmatically and automatically achieve these results for you.
Sort and filter in Excel
You can also sort in ascending (A to Z) and descending (Z to A) order in Microsoft Excel. Excel offers Smallest to Largest and Largest to Smallest sorting options when you’re working with numbers.
Similar to the SORT function in Google Sheets, Excel includes custom sort capabilities that are available from the menu. After you select the data range, click the Sort & Filter button to select the criteria for sorting. You can even sort by the data in rows instead of by the data in columns if you select Sort left to right under Options. (Sort top to bottom is the default setting to sort the data in columns.)
If you want to learn more about sorting and filtering in Excel, start with these resources:
- Sort data in a range or table (Microsoft Support): instructions to perform sorting in a variety of use cases
- Excel training: sort and filter data (Microsoft Support): sorting and filtering videos with transcripts
- Excel: sorting data: video of how to use the Sort & Filter and Data menu options for sorting
Excel also has SORT, SORTBY, and FILTER functions. Explore how you can use these functions to automatically sort and filter your data in spreadsheets without having to select any menu options at all.
Sort and filter manually with menus and buttons
Both Sheets and Excel feature menu options designed to let you sort and filter without using functions. For example, after selecting the data you’d like to sort in Google Sheets, you can choose Data > Sort sheet or Data > Sort range to sort that data. To filter the data, select all the columns and rows and choose Data > Create a filter. In Excel, you can use the Sort & filter button to bring up a user-friendly interface that guides you through sorting or filtering.
Finally, when using menus or buttons, here are a couple of best practices:
- Back up or make a copy of your data before making major changes.
- When filtering data, keep in mind that other users may also be accessing the spreadsheet. For example: Filters in Google Sheets can affect all viewers, so you should use filter views for personal filtering.