A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?
Which of the following query statements would be used when filtering data in a relational database management system? (Select two).
An analyst needs to determine the appropriate data type for the following sample data:
sample data collected:
Which of the following data types should be used for this data?
An analyst modified a data set that had a number of issues. Given the original and modified versions:
Which of the following data manipulation techniques did the analyst use?
A reporting analyst is creating a dashboard that shows the year-over-year performance for a sales organization. Which of the following is the best visual for the analyst use to illustrate the organization's performance?
A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:
Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?
An analyst is reporting on the average income for a county and is reviewing the following data:
Which of the following is the reason the analyst would need to cleanse the data in this data set?
An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?
A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:
Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?
What would be an example of an acceptable form of primary identification for the Data+ exam?
A site reliability team wants to monitor the stability of their website. so they can proactively diagnose issues when they occur Which of the following deliverables would best suit their needs?
A data analyst must separate the column shown below into multiple columns for each component of the name:
Which of the following data manipulation techniques should the analyst perform?
A data analyst needs to perform a full outer join of a customer's orders using the tables below:
Which of the following is the mean of the order quantity?
Which of the following terms best describes a situation in which a rating scale does not conform to previously agreed-upon requirements?
Andy is a pricing analyst for a retailer. Using a hypothesis test, he wants to assess whether people who receive electronic coupons spend more on average.
What should Andy's null hypothesis be?
A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be themost efficient way to deliver this report?
Given the following data table:
Which of the following are appropriate reasons to undertake data cleansing? (Select two).
A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:
Customer Table -
In-store Transactions –
Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?
An analyst wants to combine two data sets into a single spreadsheet. Column names from the first spreadsheet are listed in rows in the second spreadsheet. Which of the following is the first step the analyst should take to combine the data sets?
Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?
Which of the following file formats is best suited to start exploratory analysis within statistical software?
Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?
During data profiling, an analyst decides to recode the status column in the following data set:
Which of the following data concerns explains why the analyst wants to take this action?
Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?
After completing web scraping, which of the following file formats needs to be parsed?
An analyst wants to extract data from a variety of sources and store the data in a cloud-based environment prior to cleaning. Which of the following integration techniques should the analyst use?
A data analyst has a set with more than 40.000 rows in the sample schema below:
The analyst would like to create one column that contains the customers’ birth dates. Which of the following data quality dimensions would BEST explain the reason for compilation?
A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?
A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?
A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?
A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?
An analyst is working on a project for a director. During this process. the analyst pulled the data. created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?
An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:
Which of the following charts should the analyst consider including in the dashboard?
Which of the following explains why standardization of data field names is important to master data management concepts?
Given the table below:
Which of the following boxes indicates that a Type Il error has occurred?
Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)
A company notifies its employees that emails will be automatically moved to a cloud-based server in 180 days. Which of the following describes this concept?
An analyst is creating a resource to improve users' experience when they select specific records based on particular dates. Which of the following should the analyst use to create a resource that best meets user needs?
An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?
A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?
A data analyst was asked to create a visual representation of sales for the first quarter of 2020. Which of the following visualizations should be used when a time element is present?
A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?
Given the following table of student scores (with some values that violate the allowed scoring rules), which of the following is the best reason for cleansing the data?
Which one of the following would not normally be considered a summary statistic?
A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427sq mi (2.65q km). Which of the following sampling methods would MOST likely result in a representative sample?
Which of the following defines the policies and procedures for managing the master data?
An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:
Which of the following conclusions is accurate at a 95% confidence interval?
Which of the following data manipulation techniques is an example of a logical function?
Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?
Python
Daniel is using the structured Query language to work with data stored in relational database.
He would like to add several new rows to a database table.
What command should he use?
Which of the following technologies would be best suited for creating a multiple linear regression model?
The number of phone calls that the call center receives in a day is an example of:
An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?
A quality assurance manager is examining tolerances in Internet of Things sensors. Which of the following is the best measure for the manager to calculate?
Consider this dataset showing the retirement age of 11 people, in whole years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
This tables show a simple frequency distribution of the retirement age data.
Which of the following best describes the process of examining data for statistics and information about the data?
Cleansing
A data analyst is performing a data merge within a spreadsheet using the tables below:
https://www.bing.com/images/blob?bcid=S1XCF9p02M4GjpbGxHj0lrIaj9sw.....4c
The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?
Which of the following variable name formats would be problematic if used in the majority of data software programs?
A data analyst is compiling a report that a Chief Executive Officer needs for an impromptu meeting. The report should include information on the previous day's performance. Which of the following reports should the analyst provide?
Joseph is interpreting a left skewed distribution of test scores. Joe scored at the mean, Alfonso scored at the median, and gaby scored and the end of the tail.
Who had the highest score?
Under which of the following circumstances should the null hypothesis be accepted when a = 0.05?
Which of the following types of analyses is best to use when tracking sales revenue against quarterly targets?
Encryption is a mechanism for protecting data.
When should encryption be applied to data?
Choose the best answer.
A data analyst for a media company needs to determine the most popular movie genre. Given the table below:
Which of the following must be done to the Genre column before this task can be completed?
A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:
A user imports a data file into the accounts payable system each day. On a regular basis. the field input is not what the system is expecting. so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts. though. Which of the following changes should be made to this process to reduce the number of errors?
A data analyst reviews the following data set:
Which of the following is the range value?
Which of the following data governance concepts fits into the security requirements category?
Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?
An organization would like to add a secondary email field to its customer database in order toenrich the customer profiles. Which of the following data manipulation techniques should the analyst use to add this information?
Which of the following is a process that is used during data integration to collect, blend, and load data?
A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?
An analyst wants to determine whether a relationship between an individual's age and voting preferences exists. Which of the following is the best statistical method for the analyst to use?
A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?
Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?
Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:
Using this information, which of the following students had the BEST score?
A data analyst for a media company needs to determine the most popular movie genre. Given the table below:
Which of the following must be done to the Genre column before this task can be completed?
Which one of the following programming languages is specifically designed for use in analytics applications?
A column is being used to store strings of variable lengths. Performance is a concern, so the column needs to use as little space as possible. Which of the following data types best meets these requirements?
Given the following graph:
Which of the following summary statements upholds integrity in data reporting?
Given the table below:
Which of the following variable types BEST describes the “Year” column?
Which of the following is the first step an analyst should perform upon receiving a business request for analysis?
An analyst wants to include a graph in a quarterly sales report that shows the comparison between two quantitative variables. Which of the following visual diagrams can the analyst use to most effectively represent this relationship?
A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?
An analyst is compiling a series of reports for the new executive board to review. Which of the following elements provides a snapshot of what is contained in the reports for the executives who do not have time to focus on the details?
The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?
Which of the following describes the use of a representative amount of data from a main repository?
The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:
* County outages
* Status
* Overall trend of outages
INSTRUCTIONS:
Please, select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please, select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.
If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.
A data analyst needs to create a master file that includes customer information from the tables below:
Given the three tables above, the analyst wants to filter down the information prior to joining it together. In which of the following orders should this data manipulation bo approached for the most efficient result?
A web developer wants to ensure that malicious users can't type SQL statements when they asked for input, like their username/userid.
Which of the following query optimization techniques would effectively prevent SQL Injection attacks?
Given the customer table below:
Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?
A business intelligence team wants to create a new dashboard in order to solve a problem statement. Which of the following is the correct order of steps the team should take?
A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?
An analyst is currently working on a ticket to revamp a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?
The total values in this month's revenue report are twice as much as last month's. Which of the following most likely occurred during the ETL process?
A Chief Executive Officer (CEO) is requesting more up-to-date sales data for improved visibility prior to month-end. An analyst must determine the frequency of a sales report that was previously distributed on an as-needed basis. Which of the following would be the most appropriate frequency for this report?
Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?
A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government the does il with, some account information is not shown. Which of the following fields should be masked?
A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?
Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?