Concepts of Business Intelligence, Data Warehouse, Data Mart
In this lesson we present the basic concepts of business intelligence, the forms of analysis it supports, operational support systems, and decision support systems.
Goals: understand the difference between operational support systems and decision support systems, the concepts of data warehouse and data mart, and how these concepts originated.
Before we start studying the data warehouse, we need to understand a few things about business intelligence, or BI. BI refers to intelligence applied to business. What does this mean? The growth of enterprises reveals challenges related to knowledge about the business and about how it can directly or indirectly influence the health of the enterprise. Managers of an organization need to make decisions that are more assertive and more focused on improving activities and development, while minimizing risks that could impact results.
In this way, a business intelligence platform supports the creation of knowledge for decision making by making available the information that will support managers. This set of techniques and tools results in an analytical environment with information for managers about what happened, what is happening, and what may happen in the enterprise in the future.
Example 1: a supermarket manager wants to analyze what happened and identify which products sold in summer. To do so, it is necessary to analyze, for the last few years, the products with the most sales in the months from December to March. If the same manager plans the purchase of products to replenish stock for sale, for example, the question becomes what could happen in the future. It is also possible to analyze the products bought by customers, build a consumption profile, and offer products suited to that profile, suggesting related products to the best buyers. These are the types of analysis: descriptive, diagnostic, predictive and prescriptive.
According to Gartner (2020), the types of analysis are described below.
Diagnostic analysis: examines the past to answer how something happened. It characterizes questions about a subject, for example product sales in summer at the supermarket.
Descriptive analysis: examines the data to answer questions about what happened or what is happening, for example the sales of the past week.
Predictive analysis: uses data mining techniques, based on past data, to answer questions about what will happen.
Prescriptive analysis: determines the actions to take to reach a goal, for example what to do to sell more products from the personal hygiene sector, using graph analysis, simulation, complex event processing, neural networks and machine learning.
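To make the four types of analysis concrete, here is a minimal sketch on invented monthly sales figures. The data, the three-month average used as a forecast, and the reorder rule are all assumptions chosen for illustration; real predictive and prescriptive analyses use far richer techniques.

```python
from statistics import mean

# Hypothetical monthly unit sales for one product (toy data, not from the text).
monthly_sales = {
    "2019-12": 120, "2020-01": 150, "2020-02": 140, "2020-03": 110,
    "2020-04": 60, "2020-05": 50,
}

# Descriptive: what happened? Total and average units sold.
total_units = sum(monthly_sales.values())
average_units = mean(monthly_sales.values())

# Diagnostic: how did it happen? Compare the summer months (December to March) with the total.
summer_months = {"12", "01", "02", "03"}
summer_units = sum(v for k, v in monthly_sales.items() if k[-2:] in summer_months)
summer_share = summer_units / total_units

# Predictive (deliberately naive): forecast next month as the mean of the last three months.
forecast_next = mean(list(monthly_sales.values())[-3:])

# Prescriptive (a toy rule): reorder only if the forecast exceeds the current stock.
current_stock = 80
should_reorder = forecast_next > current_stock
```

Each step answers a different question over the same data, which is the point of the classification: the analysis chosen depends on the question the manager needs answered.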
The form of data analysis is linked to the organization's objectives: the aim is to visualize the relevant data to facilitate decision making. Now let's get to know what a decision support system is and go deeper.
Data warehouse. According to Laudon (2014), the objective of a management information system (MIS) is to help an organization achieve operational excellence, develop new products, services and business models, strengthen relationships with customers and suppliers, improve decision making, and gain competitive advantage and survival. The management information system makes reports available to users at the management level, where managers have more specific goals. Decision support systems, on the other hand, are knowledge-based systems that support decision making in organizations with analysis tools and views from different analysis perspectives. They process large volumes of data, consolidate them, and provide analytical environments with queries in the format of reports and dashboards.
There is also the executive information system, for decision making by company executives. Its analyses are more summarized and its query interface is simpler and more direct. These three types of management information systems are all intended to support decision making, each one aimed at a specific audience. The data warehouse is a management information system focused on supporting the decision making that is normally carried out by the organization's managers. The data warehouse concept emerged between the 1980s and 1990s from the work of researchers Barry Devlin and Paul Murphy, who named it the business data warehouse: an environment that integrates data to support analytics on an organization's data.
Although Bill Inmon had already used the term data warehouse in the 1970s, the cited article described the problem to be solved and the solution to be implemented for enterprise data integration. Inmon later spread the concept of the data warehouse (DW) and today is known as the father of the data warehouse. Professor Ralph Kimball is also a reference in the DW concept, and he developed an implementation approach different from the one presented by Inmon.
Inmon's top-down approach starts from a framework that broadly covers the subjects contained in an organization, the DW, and from this view the data marts (DM) are derived. The DM will be detailed in the next topic. Kimball's approach is bottom-up. Bottom-up is dedicated to creating smaller views with the data marts (DM) and then integrating these modules, resulting in the organization's DW. Figure 1 presents the approaches defended by the two authors.
Figure 1. Top-down approach (Inmon): DW Car Rental Company, from which Rent DM and Sale DM are derived. Bottom-up approach (Kimball): Rent DM and Sale DM, integrated into DW Car Rental Company.
The choice of approach to be implemented by an organization depends on its need for analysis. However, the bottom-up approach is often chosen because it is easier to implement: it explores one subject at a time, evolving with the development of the data marts until the desired data warehouse is reached. The next topic details the data mart concept for a better understanding of the cited approaches.
Data marts (DM). A DM is a data warehouse focused on one subject of an organization. It is a subset of a data warehouse. The data warehouse is formed by several DMs linked by common analysis perspectives. For faster implementation of the analytics environment, it can be built one data mart at a time. In this case it is important to understand each data mart as part of a whole DW that will be integrated with the other subjects, providing analysis for the entire organization. Note the car rental analysis scenario below. In order to provide an excellent service to its customers, a car rental company maintains a fleet of vehicles bought new (zero kilometers), with up to one year of use, to rent to its customers. Upon completing one year of use, vehicles are sold and new vehicles are purchased for replacement. To increase profits and build customer loyalty by offering benefits on rentals, the rental company wants to know which customers have rented a vehicle at least once a month in the last six months.
Chart: Rent DM, loyal customers in the last six months. Reference month: July 2020.
Customer John, number of rentals: thirty-five.
Customer José, number of rentals: thirty-two.
Customer Mary, number of rentals: twenty-eight.
For this, an analysis environment was built with the data mart Rent DM, making it possible to answer the question about customers. Over time, the company felt the need to answer another question: do customers who bought cars from us participate in the loyalty program? To answer this question, the data mart Sale DM was built.
Chart: Sale DM, used vehicle sales. Reference month: July 2020.
Customer José, vehicle sale value: ninety thousand reais.
Customer Ana, vehicle sale value: fifty thousand reais.
Customer Joaquim, vehicle sale value: forty-five thousand reais.
The Sale DM data mart has the same analysis perspective as the Rent DM data mart: the customer view. With this perspective in common between the two DMs, it is possible to relate them and analyze rental and sale information for the rental company's customers.
Chart: DW Car Rental Company (Rent DM and Sale DM), vehicle sales to loyal customers. Reference month: July 2020.
Customer José, number of rentals: thirty-two, vehicle sale value: ninety thousand reais.
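The way the two data marts are related through the shared customer perspective can be sketched as a simple join. This is only an illustration: the list-of-dictionaries layout and the field names are assumptions, and the rows echo the charts in the scenario.

```python
# Toy rows echoing the scenario's charts; structure and field names are assumptions.
rent_dm = [
    {"customer": "John", "rentals": 35},
    {"customer": "José", "rentals": 32},
    {"customer": "Mary", "rentals": 28},
]
sale_dm = [
    {"customer": "José", "sale_value": 90_000},
    {"customer": "Ana", "sale_value": 50_000},
    {"customer": "Joaquim", "sale_value": 45_000},
]

# Index one data mart by the shared perspective (the customer).
sales_by_customer = {row["customer"]: row["sale_value"] for row in sale_dm}

# Keep only customers present in both data marts (an inner join on customer).
loyal_buyers = [
    {**row, "sale_value": sales_by_customer[row["customer"]]}
    for row in rent_dm
    if row["customer"] in sales_by_customer
]
print(loyal_buyers)
```

Only José appears in both data marts, so he is the only loyal customer who also bought a vehicle. The join works because both data marts describe customers in the same way.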
From the car rental example, it is possible to verify that the data warehouse and the DMs provide managerial analyses that facilitate and improve the performance of the organization's activities, with consistent analyses over time.
Operational support system versus decision support system. Operational support systems, known as online transaction processing (OLTP) systems, meet the day-to-day needs of organizations by recording the events that occur in each operation carried out. For example, the sales support system in the supermarket analysis scenario receives all occurrences of events: purchases made by customers in the various physical stores and through e-commerce. All operations of adding, changing and deleting records take place during the period of customer service.
Comment: this system must be available so that the supermarket's operation is not harmed; that is, there can be no contention for data access that generates slowness in this environment. The analyses performed on the operational support systems' databases are narrow and retrieve few records at a time.
For example: which products did the customer João buy today in the physical store? If it is necessary to analyze the volume of purchases made by João in the last two years in physical stores and through e-commerce, this will not be feasible. The volume of data to be analyzed is too large and would compete with the operations being carried out in the transactional operational support system.
How can this problem be solved? It can be solved by building a decision support system, known as online analytical processing (OLAP), resulting in an analytical environment where data is available to answer questions efficiently without competing with the organization's transactional operations. In a DW/DM, historical analysis is very efficient because its architecture is designed to explore large volumes of data.
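The difference in query shape between the two kinds of system can be sketched with Python's built-in sqlite3 module. The table, columns and rows are invented for illustration. In practice the analytical query would run against a separate DW/DM rather than the operational database; a single in-memory table is used here only to contrast the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase (customer TEXT, channel TEXT, day TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?, ?)",
    [
        ("João", "store", "2020-07-10", 40.0),
        ("João", "ecommerce", "2019-03-02", 25.0),
        ("João", "store", "2018-11-20", 60.0),
        ("Maria", "store", "2020-07-10", 15.0),
    ],
)

# OLTP-style point query: one customer, one day, few rows.
today = conn.execute(
    "SELECT amount FROM purchase "
    "WHERE customer = ? AND day = ? AND channel = 'store'",
    ("João", "2020-07-10"),
).fetchall()

# OLAP-style query: scans two years of history and aggregates by channel.
history = conn.execute(
    "SELECT channel, SUM(amount) FROM purchase "
    "WHERE customer = ? AND day >= ? GROUP BY channel ORDER BY channel",
    ("João", "2018-07-10"),
).fetchall()
```

The first query touches a handful of rows; the second has to read every purchase in the period, which is why it belongs in an environment built for large scans.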
Main features of DW/DM. A DW/DM is subject-oriented, integrated, non-volatile, and supports analysis over time. Subject-oriented: unlike transactional systems, which are application-oriented, such as inventory and billing, the DW/DM is concerned with the main subjects of the organization.
Integrated: the extraction process captures data from various sources, applies treatments, standardization and data integration, and provides queries for different analysis views.
Non-volatile: in transactional systems, data undergo the basic operations of adding, altering and deleting records. In analytic environments, once loaded into the DW/DM, data do not undergo updates, thus ensuring that the same query over the same period, made last month and today, will present the same result.
Analysis over time: the DW/DM allows for analysis over time. The time view is very important in the analytics environment because historical data refers to a moment in time. This is the characteristic that allows evaluating, for example, the percentage of sales growth of products in the personal care sector in the first quarter of the year in relation to the first quarter of the previous year.
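The quarter-over-quarter comparison mentioned above comes down to a percentage growth calculation. A minimal sketch, with invented sales figures and a hypothetical helper name:

```python
def growth_percent(current: float, previous: float) -> float:
    """Percentage growth of `current` relative to `previous`."""
    if previous == 0:
        raise ValueError("previous period has no sales; growth is undefined")
    return (current - previous) / previous * 100

# Hypothetical first-quarter sales of the personal care sector.
q1_previous_year = 200_000.0
q1_current_year = 230_000.0
growth = growth_percent(q1_current_year, q1_previous_year)
```

With these figures the sector grew by about 15 percent. The calculation is only possible because the DW/DM keeps both quarters, each tied to its moment in time.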
In addition to the main features, the DW/DM differs from transactional systems by presenting consolidated data, being aimed at the organization's managers who work in decision making, accessing large numbers of rows to build queries, and having data redundancy.
The transactional system, in turn, has detailed data, is used mainly by users who, for example, perform customer service or stock control, accesses few rows per transaction, and is normalized.
DW architecture. A DW can be built as an integrated view of DMs linked by common perspectives, or by independent DMs that address more specific subjects. The construction of the DW/DM involves some points that must be considered by the organization, such as the available infrastructure, the scope to be considered, the availability of data, and trained professionals who will perform the activities related to the architecture of the environment. A DW/DM construction project is composed of some important steps to consider.
1. Business understanding. Gather the requirements in order to know the organization's need. It is a fundamental step in starting a DW/DM project. The scope to be defined must contain the analyses desired by the organization, the analysis perspectives, and the indicators that will be analyzed. Define the grain that will be analyzed in the environment and understand how time should behave in the environment to be created.
2. Data mapping. This step checks the availability and feasibility of the data needed to build the analyses.
3. Construction of the data staging area, where data is temporarily stored for processing.
4. Construction of the ETL (extract, transform, load) process.
5. Construction of the analyses.
According to Kimball, the architecture of a DW/DM has four distinct components: the source transactional systems, the ETL system, the data presentation area, and the BI applications.
Figure: Kimball architecture. Source transactions. Back room, ETL system: transform from source to target, conform dimensions, normalization optional, no user query support. Front room, presentation area: dimensional (star schema or OLAP cube), atomic and summary data, organized by business process, uses conformed dimensions, design goals: ease of use and query performance, enterprise DW bus architecture. BI applications: ad hoc queries, standard reports, analytic apps, data mining and models.
ETL system: extraction, transformation and load. The ETL system is defined by Kimball as an environment composed of a work area, instantiated data structures, and a set of processes, organized into three stages: extraction, transformation and load.
Extraction. Extraction is the step that collects data, identifying them, copying those needed for analysis, and storing this dataset in a temporary database. In addition to transactional systems, other data sources can be considered, such as semi-structured data (XML and JSON files) and unstructured data. These sources can complement the analyses of DWs/DMs or even compose data marts based only on data extracted from unstructured data sources.
Transformation. Data transformation consists of applying treatments to clean and standardize the data, bringing them into conformity: converting numeric fields, formatting dates, integrating data, and applying metadata to unstructured data, among others. Data transformation also contributes to the improvement of transactional systems by pointing out inconsistencies found in the extracted data. Given the large volume of data handled, for each problem found the analyst responsible for the DW/DM can inform those responsible for the transactional system so that the problem is solved. There are load control mechanisms, logs that record inconsistencies and that can be consulted as needed.
Load. The loading of the data takes place after the transformation of the data. They are inserted into the final structure, represented by the presentation area of the DW/DM, where they are accommodated in an organized way in the multidimensional data model defined for the DW/DM.
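The three ETL stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the source rows, the date format, and the names `transform`, `staging`, `presentation` and `rejects_log` are all hypothetical.

```python
from datetime import datetime

# Extract: rows as they might arrive from a source system (toy data).
source_rows = [
    {"customer": " José ", "sold_on": "15/07/2020", "value": "90000"},
    {"customer": "Ana", "sold_on": "2020-07-20", "value": "50000"},
    {"customer": "Joaquim", "sold_on": "22/07/2020", "value": "n/a"},
]

def transform(row: dict) -> dict:
    """Clean and standardize one row; raises ValueError on bad data."""
    return {
        "customer": row["customer"].strip(),
        "sold_on": datetime.strptime(row["sold_on"], "%d/%m/%Y").date(),
        "value": float(row["value"]),
    }

staging = list(source_rows)   # staging area: a temporary copy of the extract
presentation = []             # stands in for the DW/DM presentation area
rejects_log = []              # load-control log of inconsistencies

for row in staging:
    try:
        presentation.append(transform(row))   # load only rows that passed
    except ValueError as error:
        rejects_log.append((row, str(error)))
```

Here only José's row is loaded. Ana's row has a date in an unexpected format and Joaquim's has a non-numeric value, so both go to the log, where the analyst can review them and report the problem to the source system.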
The data presentation area. The presentation area is where data is organized in the dimensional model and made available to users and BI applications. At that point, the data is ready to use and can be consumed by the organization to support decision making.
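A dimensional model in the presentation area is often laid out as a star schema: a fact table surrounded by dimension tables. The sketch below uses Python's built-in sqlite3 module. The table and column names (`dim_customer`, `dim_month`, `fact_rental`) and the rows are assumptions made up for the car rental scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_month (month_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE fact_rental (
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        month_key INTEGER REFERENCES dim_month (month_key),
        rentals INTEGER
    );
    INSERT INTO dim_customer VALUES (1, 'José'), (2, 'Mary');
    INSERT INTO dim_month VALUES (202006, 2020, 6), (202007, 2020, 7);
    INSERT INTO fact_rental VALUES (1, 202006, 5), (1, 202007, 6), (2, 202007, 4);
    """
)

# A typical dimensional query: join the fact table to its dimensions and aggregate.
rentals_by_customer = conn.execute(
    """
    SELECT c.name, SUM(f.rentals)
    FROM fact_rental AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_month AS m ON m.month_key = f.month_key
    WHERE m.year = 2020
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

The dimensions (customer, month) are the analysis perspectives, and the fact table holds the indicator being measured. Reusing the same customer dimension in another data mart is what lets the two be related.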
BI applications. BI applications query the data organized in the presentation area. Through the BI applications, users can develop their own analyses or use ready-made reports and dashboards developed according to the users' needs.
DW/DM self-service. The traditional architecture of a DW/DM is under the care of BI analysts, who aim to maintain a consistent and reliable data environment, providing analyses to users, or data for BI applications and advanced users to perform analyses as needed. This flow of activities is supported by a set of tasks of understanding, requirements gathering and documentation carried out by BI analysts, and these artifacts generate a metadata repository about the analytical environment, with important information about the knowledge produced in that environment.
Comment: the service and performance of the BI team are efficient in terms of delivering a controlled environment, assisted and supported by metadata. However, in organizations where the demand is very large and the BI team is unable to meet the needs of users quickly, the need arises for a self-service model through which the user can access, model and analyze data without the assistance of the BI team.
With this way of accessing data, users can generate their analyses faster, obtaining the desired results in less time than through the service provided by the analyst specialized in BI. However, despite the self-service model offering greater speed in the preparation of analyses by users, some points of attention must be observed:
1. In this model, data is decentralized and each user creates their own dataset, applying business rules from their own point of view.
2. Metadata for the environment is not developed.
3. The lack of treatment and observation of data inconsistencies may present wrong results.
4. Analyses on the same subject may present different results, impairing decision making.
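The fourth point can be illustrated with a small sketch in which two users apply their own business rule to the same data. The orders, the `cancelled` flag and both rules are hypothetical.

```python
# Toy order data; the "cancelled" flag and both rules are hypothetical.
orders = [
    {"id": 1, "value": 100.0, "cancelled": False},
    {"id": 2, "value": 250.0, "cancelled": True},
    {"id": 3, "value": 50.0, "cancelled": False},
]

# User A counts every order as revenue.
revenue_user_a = sum(o["value"] for o in orders)

# User B excludes cancelled orders.
revenue_user_b = sum(o["value"] for o in orders if not o["cancelled"])

# Same subject, same source, two different answers.
results_differ = revenue_user_a != revenue_user_b
```

Neither user is necessarily wrong, but without a shared, documented rule the two reports cannot be reconciled. A centrally maintained environment avoids exactly this.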
DW/DM metadata. The metadata repository built together with the DW/DM environment is an important asset for both the BI team and the organization's users because it maintains important information about the data contained in the environment, allowing the identification of data by name, type and size. This set of information about data is known as a data dictionary. In addition to this information, business procedures and business concepts are stored, along with the applied business rules and all the information important for the development of this environment. Kimball (2010) explains that metadata is analogous to the encyclopedia of the DW/BI environment and that the analyst must take care to populate and maintain the metadata repository. Barbieri (n.d.) explains that metadata defines data from various perspectives:
Characteristics of what is being contextualized: name, weight, type, length, format, height, distance, price, etc. Relationships: works for, maintained by, is manager of, located in. Forms of treatment: formulas, calculations, manipulations, procedures. Rules: mandatory presence of data in that context, quality rules required for formats, values, content, etc. And even historical information: invented in, discovered by, deactivated in, etc. The main thing about working with metadata is that all important information is stored and can be consulted whenever necessary.
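A data dictionary entry can be pictured as a small record per column. The sketch below is one possible shape, not a standard: the class `ColumnMetadata`, its fields and the sample entries are assumptions chosen to mirror the perspectives listed above (characteristics, relationships, rules, history).

```python
from dataclasses import dataclass, field

@dataclass
class ColumnMetadata:
    """One data dictionary entry; the fields chosen here are illustrative."""
    name: str
    data_type: str
    length: int
    business_rule: str = ""
    maintained_by: str = ""
    history: list = field(default_factory=list)

dictionary = {
    entry.name: entry
    for entry in [
        ColumnMetadata("sale_value", "decimal", 12,
                       business_rule="must be greater than zero",
                       maintained_by="BI team"),
        ColumnMetadata("customer_name", "text", 80, maintained_by="BI team"),
    ]
}

def describe(column: str) -> str:
    """Look up a column and summarize what the repository knows about it."""
    meta = dictionary[column]
    rule = meta.business_rule or "none"
    return f"{meta.name}: {meta.data_type}({meta.length}); rule: {rule}"
```

Calling `describe("sale_value")` returns the stored type, size and rule for that column, which is the kind of lookup a metadata repository exists to answer.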
Once implemented, a data warehouse (DW) collects, processes and stores the most relevant data for an organization in order to support decision making. The implementation of this environment is related to the organization's need to unify data and analyze them historically, in order to observe the organization's behavior over time or to map future behaviors in the business. Its implementation must take into account the resources available for its conception so that the result is achieved. Furthermore, it is very important that the purpose of the construction is well defined and that it is oriented towards the needs of the organization's users.
The DW and knowledge discovery in databases (KDD). As seen before, the DW provides an organized database with several analyses over time. This data repository offers predefined queries, self-service analytics, and possibilities for knowledge discovery. Data mining is one of the steps of knowledge discovery in databases (KDD) and is related to the DW in that the DW holds data that is already processed and available for analysis, so the DW can provide data for KDD processes, generating value for the organization. It is noteworthy that one solution does not replace the other; they are complementary in the knowledge search process.
These techniques can reveal behavior patterns to aid decision making. In the supermarket analysis scenario, the DW provides queries on the volume of purchases made by customers, and the KDD processes can discover patterns in the purchases made. Example 1: have you ever heard about the relationship between disposable diapers and beer? Although there is no reliable source to validate this finding, it is a well-known story in the BI world and interesting to analyze. A large retailer in the United States of America, looking at the buying patterns of its customers, found that the increase in diaper sales on Fridays was related to beer sales, and in the majority of sales the customers were male.
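A co-occurrence pattern of this kind is commonly measured with support, confidence and lift, the basic quantities of association rule mining. The sketch below computes them for diapers and beer over invented baskets. The data is made up for illustration and says nothing about any real retailer.

```python
# Toy shopping baskets; the data is invented for illustration.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(items: set) -> float:
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)

support_both = support({"diapers", "beer"})        # share of baskets with both items
confidence = support_both / support({"diapers"})   # share of diaper baskets that also have beer
lift = confidence / support({"beer"})              # above 1 suggests a positive association
```

In this toy data the lift is slightly above 1, meaning beer appears in diaper baskets a little more often than in baskets overall. A real KDD process would compute such measures over the purchase history held in the DW.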
The explanation for this curious fact is that dads go to buy diapers for their little ones and end up taking beer for the weekend. After gaining that knowledge, the retailer strategically positioned the diapers alongside the beers to increase profits. Example 2: another example, aimed at the well-being of patients and focused on reducing costs, is the early discovery of possible high-risk surgeries for patients who have problems related to the spine. The study of the recurrence of appointments with orthopedists, occurrences of correlated exams, and therapies dedicated to this pathology may signal future surgeries.
With this knowledge, managers responsible for the clinical follow-up of patients can offer targeted and effective treatments so that unnecessary surgeries are not performed, reducing risks to the patient and reducing hospitalization costs.
In this class, we studied the concepts of business intelligence, data warehouse and data mart. Now it's up to you. Let's answer a few questions to reinforce the content.
Activities.
Question 1. Regarding the business intelligence concept, which aims to provide analysis for decision making in private or public organizations, it is possible to say that:
a) It is a system that provides reports on the data produced by the organization.
b) It is a system that transforms the data to build the analyses requested by the organization.
c) It is a set of techniques and tools that support the creation of an analytical environment in which analyses can be done through reports and dashboards.
d) It is a tool for creating dashboards with analyses that the organization may need.
e) It is an environment that only provides analysis on the facts that are currently happening in the organization, such as how many products were sold this week.
Answer: c) It is a set of techniques and tools that support the creation of an analytical environment in which analyses can be done through reports and dashboards. According to the text, the concept of business intelligence supports the construction of knowledge for decision making using a set of techniques and tools that collect data, apply the necessary treatments, integrate them, organize them, and provide information that will support the strategic decisions of the organization.
Question 2. We can cite the following characteristic of the business intelligence concept:
a) It reports on transactional data, presenting transactions carried out during the day, and stores historical data accumulated over the years. False.
b) It extracts all data from only one data source and generates analytical views on real facts. False.
c) It generates reports with hypothetical data extracted from various sources, transforming and integrating them. False.
d) It extracts data from multiple data sources and integrates them, generating only one analysis perspective. False.
e) It integrates different data sources, providing analysis over time with diverse analysis perspectives. True.
Justification: the objective of BI systems is to integrate data from different data sources and formats, structure them, and enable analysis through different analysis views.
Question 3. Through the analytical environment it is possible to perform descriptive, diagnostic, predictive and prescriptive analyses. Regarding the forms of analysis, it is correct to state that:
a) Prescriptive analyses allow you to answer questions about what to do to achieve a certain goal. True.
b) Predictive analyses allow you to analyze the past and find answers to questions like what happened. False.
c) Descriptive analyses allow you to evaluate events that occurred in the transactional system without proper data handling. False.
d) Diagnostic analyses allow you to predict how data will behave according to facts that occurred in the past. False.
e) Diagnostic analysis and predictive analysis use neural networks for satisfactory results on historical data. False.
Justification: according to the text, prescriptive analysis is used to determine actions that can be taken to make something happen.
Question 4. The data warehouse is a decision support system that:
a) Provides analytical environments with queries only in the format of reports that make it possible to read the facts that occurred from a limited number of analysis views, making analysis difficult. False.
b) Supports operational events and is formed by data marts that deal with subjects contained in an organization. False.
c) Consists of data marts that do not allow queries by common analysis views because the data is not prepared to be viewed in reports and dashboards. False.
d) Processes large volumes of data, consolidates them, and provides analytical environments with queries in report format that enable the reading of facts from various analysis views. It is formed by data marts that deal with subjects in an organization. True.
e) Does not support large volumes of data, and historical analyses must be limited so that query processing does not compete with operational support systems. False.
Justification: the data warehouse organizes large volumes of consolidated data and provides analytical environments with standard queries in reports, which enables the reading of facts from various analysis views. It is formed by data marts that deal with different subjects, for example by department, in an organization.
Question 5. About the characteristics of the data warehouse, it is possible to affirm that:
a) It is subject-oriented, does not integrate data, is non-volatile, and presents historical data. False.
b) It is subject-oriented, has integrated data, its data changes over time, and it presents historical data. False.
c) It has a departmental focus, does not integrate data, is non-volatile, and presents historical data. False.
d) It is subject-oriented, has integrated data, is non-volatile, and presents historical data. True.
e) It has a departmental focus, has integrated data, is non-volatile, and presents historical data. False.
Justification: the data warehouse is subject-oriented, integrates data from systems, does not change past events, and stores historical data, enabling analysis over time.