Data Profiling is the very first step in understanding what insights your data can provide. Profiling enables you to decide whether your data is sufficiently complete and accurate for the use case(s) under consideration. Given the diversity and immense volume of data now available for AI solutions, machine learning is used to profile every record, to identify every case where a feature is unique or missing, and to determine the mean, standard deviation, median, and min/max values of every data feature. These statistics in turn make it possible to build a distribution for every feature and, where necessary, to transform the data to normalize it.
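As an illustration only, the per-feature statistics described above can be sketched in a few lines of Python using the standard library. The function names (profile_feature, profile_records, log_transform), the sample records, and the choice of a log transform are assumptions made for this example; they do not describe Macrosoft AI's actual tooling, and the sketch uses plain summary statistics rather than machine learning.

```python
import math
import statistics


def profile_feature(values):
    """Summarize one numeric feature; None marks a missing value."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        # True when no value repeats among the non-missing values.
        "is_unique": len(present) > 0 and len(set(present)) == len(present),
    }
    if present:
        summary.update(
            mean=statistics.fmean(present),
            median=statistics.median(present),
            # Sample standard deviation needs at least two values.
            stdev=statistics.stdev(present) if len(present) > 1 else 0.0,
            min=min(present),
            max=max(present),
        )
    return summary


def profile_records(records):
    """Profile every feature found in a list of dict records."""
    features = sorted({name for record in records for name in record})
    return {
        name: profile_feature([record.get(name) for record in records])
        for name in features
    }


def log_transform(values):
    """One possible normalizing transform for right-skewed, non-negative data."""
    return [None if v is None else math.log1p(v) for v in values]


# Hypothetical sample data, invented for this sketch.
records = [
    {"customer_id": 1, "income": 40000.0},
    {"customer_id": 2, "income": 52000.0},
    {"customer_id": 3, "income": None},
    {"customer_id": 4, "income": 400000.0},
]
profile = profile_records(records)
```

In this sample, customer_id is flagged as unique and income shows one missing value and a maximum far above its median, which is the kind of skew that might call for a transformation such as log_transform before modeling.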
One of our guiding principles at Macrosoft AI is to become experts on our client’s internal data, as well as on any complementary sources determined to be relevant to the use case(s).
Working hand-in-hand with Profiling, we apply machine learning algorithms to draw conclusions from the profiled data and to create categories for all relevant data.
Machine Learning-enabled Data Profiling may identify a gap between the available data and the data required to effectively satisfy the use case(s). Any such gap is normally remediated in the Data Preparation event.
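One simple way to picture such a gap check is to compare a profile summary against the requirements of a use case. The sketch below is hypothetical: the find_gaps function, the feature names, and the missing-rate thresholds are all assumptions chosen for illustration, not a description of how the gap is actually identified in practice.

```python
def find_gaps(profile, requirements):
    """Compare a profile against per-feature requirements.

    profile maps a feature name to its record count and missing count.
    requirements maps a required feature name to the highest missing
    rate the use case can tolerate (a hypothetical threshold).
    """
    gaps = []
    for name, max_missing_rate in requirements.items():
        stats = profile.get(name)
        if stats is None or stats["count"] == 0:
            gaps.append((name, "feature not present"))
            continue
        rate = stats["missing"] / stats["count"]
        if rate > max_missing_rate:
            gaps.append(
                (name, f"missing rate {rate:.0%} exceeds {max_missing_rate:.0%}")
            )
    return gaps


# Hypothetical profile summary and use-case requirements.
profile = {
    "customer_id": {"count": 4, "missing": 0},
    "income": {"count": 4, "missing": 1},
}
requirements = {"customer_id": 0.0, "income": 0.10, "region": 0.05}
gaps = find_gaps(profile, requirements)
```

Here income would be reported because a quarter of its values are missing, and region because it does not appear in the available data at all; both are the kind of finding that would be passed on for remediation.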