Module 4: Loading, Shaping and Merging Data
Abstraction is the process of shaping and moulding raw data to make it more relevant before it is presented to machine learning algorithms. It is the cornerstone of the methodology set out in these procedures.
The procedures that follow show how to load data into R and, once the data resides in R, how to shape and mould it as part of Abstraction.
In Jube procedures and methodology, Abstraction is most generally offloaded to Relational Database Management platforms; the shaping and moulding of data in R tends to augment these core datasets.
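As a rough illustration of what augmenting a core dataset in R can look like, the sketch below derives two extra columns from values already present on each row. It uses base R only. The data frame `transactions` and its columns (`account`, `amount`, `balance`) are hypothetical stand-ins for a dataset that would normally arrive from a database, and the threshold of 500 is an arbitrary assumption, not a value taken from the procedures.

```r
# Hypothetical raw data, standing in for a core dataset fetched from a database.
transactions <- data.frame(
  account = c("A1", "A2", "A3"),
  amount  = c(120.50, 15.00, 980.25),
  balance = c(1000, 30, 1000),
  stringsAsFactors = FALSE
)

# Derive a numeric column from two existing columns on the same row.
transactions$amount_to_balance <- round(transactions$amount / transactions$balance, 2)

# Derive a logical flag (threshold chosen purely for illustration).
transactions$large_amount <- transactions$amount > 500

print(transactions)
```

The derived columns sit alongside the original ones, so the augmented data frame can be passed on to a machine learning step without altering the source data.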
Table of contents
- Slides
- Procedure 1: Using Numeric Functions to create a Horizontal Abstraction
- Procedure 2: Extracting a substring from a string, testing logically and presenting for machine learning
- Procedure 3: Searching with Regular Expressions
- Procedure 4: Create a Date with a specific Date and Time format
- Procedure 5: Perform Date Arithmetic
- Procedure 6: Extract Reporting Periods from a Date
- Procedure 7: Importing a CSV file with RStudio
- Procedure 8: Importing a pipe separated file
- Procedure 9: Connect to an SQL Server Database
- Procedure 10: Fetch an entire table from an SQL Server Database
- Procedure 11: Sorting a Data Frame with the arrange() function
- Procedure 12: Specifying columns of a Data Frame to return
- Procedure 13: Adding Vectors or Factors to an existing Data Frame
- Procedure 14: Merging a Data Frame
- Procedure 15: Delete a Vector from a Data Frame
- Procedure 16: Exporting a CSV file