SRID/AGRA Data Cleaning Workshop

R is a "programming language that provides a user with powerful data cleaning and graphical analysis capabilities," according to its description.
Participants also learned that one of R's most powerful features is that it is open source: anybody may access the underlying code that runs the software and add their own code for statistical computing or graphics for free.
In addition, R contains a collection of tools expressly designed to clean data efficiently and thoroughly. Participants had already installed the software and its required packages before the training.
Because the program was new to some participants, the first day's work included an introduction to its usage. Among other things, they learned how to create a new R project and set up the base path for the project.
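
As a rough illustration of the base-path idea, the sketch below builds every file path from a single starting folder. The folder and file names are hypothetical and are not taken from the workshop materials; a temporary directory stands in for the real project folder.

```r
# Hypothetical project layout; the folder and file names are illustrative only.
base_path <- file.path(tempdir(), "data_cleaning_workshop")
raw_dir <- file.path(base_path, "data", "raw")

# Create the folder structure if it does not exist yet.
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)

# Build file paths from the base path instead of hard-coding full paths.
prices_file <- file.path(raw_dir, "wholesale_prices.csv")
```

Building paths this way means that only base_path needs to change when the project is moved to another computer.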

The sessions covered the technical aspects of data correction, beginning with importing raw data from the target dataset and supporting datasets. Specifically, participants loaded flat files into R with the read_csv function of the readr package, which is part of the core tidyverse. The day’s discussion also covered how to rename columns, mainly to remove spaces and numbers, for easier manipulation and analysis. Participants were introduced to a summary document of R code, loosely called the “cheat sheet”. The purpose of the “cheat sheet” is to serve as a guide that can be consulted when trying to understand or remember something complex about the R language. Afterward, participants worked through a series of distinct approaches to screening for data errors, missing values, and outliers, using the wholesale prices of agricultural commodities as an example dataset for practice.
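
The steps described above might look roughly like the following sketch. The sample data, the column names, and the 1.5 x IQR rule for flagging outliers are assumptions made for illustration; the report does not state which dataset layout or screening methods were used in the workshop.

```r
library(readr)

# Hypothetical sample data standing in for the wholesale price file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Market Name,Commodity,1 Price GHS",
  "Accra,Maize,250",
  "Kumasi,Maize,245",
  "Tamale,Maize,",
  "Techiman,Maize,240",
  "Accra,Rice,255",
  "Kumasi,Rice,260",
  "Tamale,Rice,2500"
), csv_path)

# Import the flat file with read_csv from the readr package.
prices <- read_csv(csv_path, show_col_types = FALSE)

# Rename columns: lower-case, drop digits, replace spaces with underscores.
clean_names <- tolower(names(prices))
clean_names <- gsub("[0-9]", "", clean_names)
clean_names <- gsub("\\s+", "_", trimws(clean_names))
names(prices) <- clean_names

# Screen for missing values: count the NA entries in each column.
missing_counts <- colSums(is.na(prices))

# Screen for outliers with the 1.5 x IQR rule (an assumed method).
q <- quantile(prices$price_ghs, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * (q[[2]] - q[[1]])
is_outlier <- !is.na(prices$price_ghs) &
  (prices$price_ghs < q[[1]] - fence | prices$price_ghs > q[[2]] + fence)
outliers <- prices[is_outlier, ]
```

In this made-up sample, missing_counts reports one missing price and outliers holds the single row with a price of 2500, which would then be checked against the source records rather than deleted automatically.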

As part of the ongoing project titled Institutional Capacity Enhancement of Government Capacity in the Surveillance of Food Systems in Ghana, run by AGRA in collaboration with SRID, the SRID-ICT and Data Management Unit led a data cleaning workshop for its staff and other members of the Directorate. The workshop used the R statistical package for computing and graphics to clean data received from the Global Food Security Strategy-Zone of Influence (GFSS-ZOI), which includes seventeen (17) selected districts.
The purpose of the workshop was to guide a group of twenty (20) staff members from SRID in using R as a tool for cleaning the data received from the field.