PREDICTIVE MACHINE LEARNING (ML) ALGORITHM USING IOT FRAMEWORK FOR NOVEL CORONA VIRUS (COVID-19)

During the early months of the COVID-19 pandemic, with no recommended cure or vaccine available, the only way to break the chain of transmission was self-isolation maintained through physical distancing. It is now understood that the world requires much faster solutions to deal with the future spread of COVID-19 through non-clinical methods such as data mining, augmented intelligence and other Artificial Intelligence (AI) techniques. Providing timely diagnosis and effective prognosis for patients during the 2019-nCoV pandemic has become a major challenge for the healthcare industry. Therefore, the proposed framework uses the Internet of Things (IoT) in the healthcare industry to collect real-time symptom data that helps predict whether a person is infected with the COVID-19 virus. The prediction draws on signs such as body temperature, blood oxygen level, headache and coughing patterns. The research thus focuses on faster identification of potential COVID-19 infection cases by applying Machine Learning (ML) algorithms to real-time symptom data. The results show that the K-Nearest Neighbour (KNN) algorithm is more effective than the other ML algorithms considered, Naive Bayes and Logistic Regression (LR), in predicting the possible recovery of patients infected with COVID-19, with an accuracy of 96.85%.


Introduction
The development of the IoT has opened new opportunities in applications such as smart cities and smart healthcare. At present, the major benefit of the IoT in healthcare is real-time, remote monitoring. With IoT systems, a severe situation such as COVID-19 can be controlled and managed worldwide without imposing heavy restrictions on people and industries. COVID-19 is caused by a virus that produces respiratory symptoms and appears to spread more widely than the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) of 2003 [1]. Until a viable vaccine is found, the spread of the virus can be controlled by observing social distancing [2]. Implementing good models for healthcare, surveillance and transportation may help reduce the spread of disease [3] [4]. Hence, an IoT platform combined with AI can provide the following benefits during pandemic conditions [5]:
1) Surveillance and image recognition methods can help improve public security.
2) Drones can be used for disinfection and for delivering food and medicines to patients.
3) AI-enabled applications can perform contact tracing, which helps limit people's access to public places.
In general, an IoT system used in healthcare consists of several sensors linked to a server that provides real-time monitoring of users or their environment. During a pandemic, sensors assisted by AI can help predict whether a person is infected with the COVID-19 virus. The prediction draws on signs such as body temperature, blood oxygen level, headache and coughing patterns. In addition, there are other useful features, such as tracking of people's geo-locations.
Data Mining (DM) is an advanced AI technique used to discover useful, novel and efficient patterns or knowledge in the available data [6] [7]. It reveals associations and patterns within a single dataset or across multiple datasets [8] and is widely used for the diagnosis and prognosis of several diseases, including SARS-CoV [9]. An enormous amount of data associated with the 2019-nCoV pandemic is generated around the world every day. This data is a valuable resource for mining and analysis, since the patterns and knowledge extracted from it support better decisions for controlling the COVID-19 outbreak. DM is applied in many different areas of the healthcare sector, namely prediction of patient outcomes, modelling of health outcomes, hospital ranking, evaluation of treatment efficiency, infection control, recovery and stability [10] [11]. This research focuses on developing better ML algorithms and DM models to accurately predict which patients are infected with 2019-nCoV. These predictions help the government decide when people can be released from isolation centres after recovery. In addition, the model aims to help COVID-19 patients recover from the infection rather than lose their lives. The model is thus useful to health workers in determining the stability and likelihood of recovery of recently infected COVID-19 patients.

IoT background and data source for COVID-19
The IoT uses communication and sensor technologies, together with pervasive and ubiquitous computing, to turn physical objects into smart objects [12]. This improves the delivery of smart services to users and enhances their quality of life [13]. The architecture of the IoT consists of three major layers: 1. Network 2. Physical activities 3. Application. Physical objects are equipped with sensors in order to collect heterogeneous data. These sensors have limited computational capacity and a short lifetime, yet better decisions can be made when more data is collected. The complexity of data processing has therefore become a bottleneck [14], and connectivity is used to compensate for the low computational power of the sensors. Research on COVID-19 depends on obtaining COVID-19 data, so we suggest a few ways of collecting it from different sources: IoT devices, testing laboratories, hospitals and manual data collection. Since COVID-19 data is released by the government of each country, a centralized system can be designed that communicates with IoT devices and other organisations, as shown in Figure 1. Because temperature is an important factor in detecting COVID-19, the government can install temperature sensors at the entrances of public places such as airports, railway stations, cinema halls and shopping malls. Whenever a person passes through such an entrance, the device measures the person's temperature and sends the data to the server immediately, so that the government can take prompt action. Testing laboratories can also transmit their data to the central server, so that the government is aware of the health of its citizens and can identify COVID-19 patients.
Hospitals also play a crucial role in the COVID-19 pandemic. They continuously provide services to their patients and are the main source of COVID-19 patient data, because they examine the people affected by the disease.
The paper is organized as follows: Section 2 reviews the literature on IoT-based healthcare, Section 3 describes the research methodology based on the proposed architecture, Section 4 describes the ML algorithm for prediction of COVID-19, Section 5 presents the performance evaluation based on the accuracy of the classifier, and Section 6 concludes the paper.

Literature Review
This section reviews how various researchers have applied the IoT, together with ML, to healthcare. Usak et al. [15] presented a review of the benefits of IoT-based healthcare methods. Their paper discusses the key challenges the IoT faces in delivering better healthcare services and classifies the reviewed work. Wu et al. [16] focused on improving outdoor safety by proposing a hybrid IoT safety and health monitoring technique. This technique involves two layers: 1. Layer 1 is used for collecting user data. 2. Layer 2 is used for aggregating the collected data over the internet.
Wearable devices are used to collect safety indicators from the surrounding environment and health signs from the patients.
Hamidi proposed a biometric-based authentication technique for IoT smart health data in order to assure the security and privacy of healthcare data [17]. Rath and Pattanayak [18] aimed to introduce smart healthcare in urban areas with the help of IoT devices. They focused on issues such as security, privacy and time precision for patients in the VANET zone, and evaluated the proposed technique with NetSim and NS2. The cloud IoT-Healthcare paradigm, which integrates the IoT with cloud computing in the healthcare domain, is described in [19]. That study discusses the integration challenges and recent trends in cloud IoT-Healthcare, and the challenges are classified into three levels [21]. Maghdid [22] proposed using the sensors available in smartphones to collect healthcare data such as body temperature.

Research Methodology
This section introduces the proposed method, which addresses the COVID-19 pandemic through an IoT-based framework with an ML algorithm for monitoring and predicting possible cases of coronavirus infection in real time. The proposed framework also assists in predicting the treatment response of confirmed cases and in better understanding COVID-19.

Data collection about symptoms
The research begins with collecting real-time symptom data using a set of wearable sensors placed on the patient's body [23]. A previous study identified the most significant COVID-19 symptoms from a real COVID-19 patient dataset. The identified symptoms are fever, fatigue, sore throat, cough and respiratory issues, and each of them can be detected with biosensors. For instance, fever can be detected with temperature-based sensors [24]. Audio-based sensors with aerodynamic and acoustic models have been used for detecting cough and classifying it across different ages [25]. Fatigue can be detected with motion-based and heart rate sensors [26]. An image-based classification sensor is used for detecting sore throat [27]. An oxygen-based sensor is used for detecting respiratory issues such as shortness of breath [28].

Dataset of COVID-19
The COVID-19 dataset was obtained from the KCDC data available on the Kaggle website [29]. This dataset consists of 3254 records with 8 attributes, including patient ID, age, gender, oxygen saturation, sore throat, cough severity and shortness of breath. The proposed healthcare setup uses IoT and BSN technology with several sensors, namely a blood pressure sensor, a temperature sensor and a pulse rate sensor. These sensors measure the parameters and send the data to the controller. A buzzer sounds when a value exceeds the given range, and the sensed data is shown on an LCD display. At the same time, the data is sent to doctors over the internet, which helps them provide faster and more accurate responses in real time. The dataset was cleaned and prepared by extracting the relevant attributes from the original dataset. The documented symptom data initially yielded a list of 20 symptoms. Pre-processing reduced these to 5 major symptoms, giving a dataset of 1476 × 7 (1476 records with 7 attributes). Of these records, 854 are confirmed COVID-19 cases and 622 are non-confirmed cases, and together they form the sample used for modelling.

Proposed Framework
The proposed system is divided into two models, each with its own sample dataset:

Train dataset
1. Collect data from online data sources, such as real-time and recorded patient data.
2. Apply DM methods such as data acquisition, data cleaning, data pre-processing, data conversion, data modelling and outlier detection.
3. Once all DM processes are complete, save the data to the online database. This stored data serves as background knowledge and can be used for testing.
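The cleaning and outlier-detection steps listed above can be illustrated with a minimal sketch. The paper does not say which outlier method was used, so the interquartile-range (IQR) rule below is an assumption, as are the field names `temperature` and `spo2` and the sample readings, which are invented for illustration only.

```python
from statistics import median

def clean_records(records, required):
    """Data cleaning: drop records that are missing any required field."""
    return [r for r in records if all(r.get(k) is not None for k in required)]

def iqr_bounds(values, k=1.5):
    """Return (low, high) limits from the interquartile-range rule (assumed method)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    q1 = median(ordered[:mid])                       # median of the lower half
    q3 = median(ordered[mid + len(ordered) % 2:])    # median of the upper half
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def drop_outliers(records, field):
    """Outlier detection: keep only records whose field lies inside the IQR limits."""
    low, high = iqr_bounds([r[field] for r in records])
    return [r for r in records if low <= r[field] <= high]

# Invented sample readings, not taken from the paper's dataset.
raw = [
    {"temperature": 36.8, "spo2": 98},
    {"temperature": 37.1, "spo2": 97},
    {"temperature": None, "spo2": 95},   # missing reading
    {"temperature": 38.4, "spo2": 93},
    {"temperature": 36.6, "spo2": 99},
    {"temperature": 85.0, "spo2": 96},   # implausible value, e.g. a sensor glitch
    {"temperature": 37.0, "spo2": 98},
]

cleaned = clean_records(raw, required=("temperature", "spo2"))
prepared = drop_outliers(cleaned, "temperature")
print(len(raw), len(cleaned), len(prepared))
```

In this toy run the record with the missing temperature is removed first, and the 85.0 reading is then rejected as an outlier, while the feverish 38.4 reading is kept.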

Test dataset
1. The IoT-based environment is built with a minimum of 6 sensors as wearable devices.

Once the sensors are connected to the Raspberry Pi, data is collected from them by batch processing. Various sensors can be attached to the Raspberry Pi to collect large volumes of raw health data. The data collected through the sensors is then passed through data mining steps such as data acquisition, data cleansing, data pre-processing and outlier detection. After this process is complete, the dataset is stored in a cloud database, which secures the COVID-19 data in cloud storage. The application is developed with a GUI and a Python web page, and it interprets patients' data in relation to their symptoms of COVID-19 infection. Finally, the proposed method helps predict whether a patient is infected with the COVID-19 virus and recommends further treatment based on the patient's records, as shown in Figure 2.
Figure 2. Block diagram of proposed architecture
The proposed method begins by attaching one end of the IoT sensors to the patient's body and connecting the other end to the Raspberry Pi. Once the connection is made, the data is acquired and stored in secured cloud storage by the Raspberry Pi B+. The biometric data can be monitored on an LCD monitor, and an alarm is triggered when a value measured by a sensor exceeds the normal range. GSM is used to send these values to the cloud server, and the latest values are displayed on the web pages. The doctor can access current and earlier patient records through login credentials provided by the healthcare authorization service, in order to suggest medicines online. Patients can likewise view their earlier records through their own login credentials authorized by the healthcare authorization service.
The design of the proposed method is further divided into two components: hardware and software.
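The alarm behaviour described above, in which a reading outside its normal range triggers an alert, might be sketched as follows. The measurement names and the numeric ranges are hypothetical placeholders chosen for illustration; the paper does not state the thresholds it uses, and real thresholds must come from clinical guidance.

```python
# Hypothetical normal ranges (assumptions, not values from the paper).
NORMAL_RANGES = {
    "temperature_c": (36.0, 37.5),
    "heart_rate_bpm": (60, 100),
    "spo2_percent": (95, 100),
}

def out_of_range(reading, ranges=NORMAL_RANGES):
    """Return the names of the measurements that fall outside their range."""
    alerts = []
    for name, (low, high) in ranges.items():
        value = reading.get(name)
        if value is not None and not (low <= value <= high):
            alerts.append(name)
    return alerts

# Invented example reading from the wearable sensors.
reading = {"temperature_c": 38.6, "heart_rate_bpm": 88, "spo2_percent": 92}
alerts = out_of_range(reading)
if alerts:
    # In the proposed system this is where the buzzer or alarm would be driven.
    print("ALARM:", ", ".join(alerts))
```

Keeping the ranges in one dictionary means the thresholds can be adjusted without touching the checking logic.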

Hardware components

1) Temperature sensor (LM35):
The patient's temperature is measured with an LM35 sensor. The LM35 series are precision integrated-circuit temperature sensors whose output voltage is linearly proportional to the Centigrade temperature, and they can measure temperature more accurately than thermistors. The sensor is not subject to oxidation, and its output voltage does not require amplification.
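Because the LM35 output is linear in temperature, with a scale factor of 10 mV per degree Celsius, the conversion from a measured voltage is a single division. The sketch below shows this under stated assumptions: the Raspberry Pi has no built-in analog inputs, so an external analog-to-digital converter is assumed, and its reference voltage (3.3 V) and resolution (10 bits) are illustrative values rather than details from the paper.

```python
LM35_MV_PER_DEG_C = 10.0  # LM35 scale factor: 10 mV per degree Celsius

def adc_to_millivolts(count, vref_mv=3300.0, bits=10):
    """Convert a raw ADC count to millivolts (ADC parameters are assumptions)."""
    return count * vref_mv / (2 ** bits - 1)

def lm35_celsius(millivolts):
    """Convert the LM35 output voltage in millivolts to degrees Celsius."""
    return millivolts / LM35_MV_PER_DEG_C

# Example: an output of 368 mV corresponds to 36.8 degrees Celsius.
print(lm35_celsius(368.0))
# Example with a raw count from the assumed 10-bit, 3.3 V converter.
print(round(lm35_celsius(adc_to_millivolts(114)), 2))
```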

2) ECG sensor:
For the ECG, signals are picked up from the chest using adhesive electrodes whose wires are connected to an AD8232 sensor. This cost-effective sensor measures the heart rhythm. The AD8232 acts as an operational amplifier that removes noise from the ECG signal, which helps capture the PQ and PR signals quickly and clearly.

3) Heart Rate sensor:
This sensor provides a digital output of the heartbeat. Its probe is placed on the patient to obtain a reading in Beats per Minute (BPM).
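One common way to turn a digital heartbeat signal into a BPM figure is to average the time between successive beats. The sketch below assumes the beat timestamps (in seconds) have already been extracted from the sensor's pulses; the function name and sample timestamps are hypothetical, and the paper does not describe how its sensor module computes BPM.

```python
def beats_per_minute(beat_times_s):
    """Estimate BPM from timestamps (in seconds) of detected beats."""
    if len(beat_times_s) < 2:
        raise ValueError("need at least two beats to measure an interval")
    # Time between each pair of consecutive beats.
    intervals = [b - a for a, b in zip(beat_times_s, beat_times_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval

# Invented example: one beat every 0.8 s is roughly 75 BPM.
print(round(beats_per_minute([0.0, 0.8, 1.6, 2.4, 3.2])))
```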

4) Raspberry Pi:
The Raspberry Pi is a cost-efficient single-board computer that reads the sensor data. The Raspberry Pi B+ is a low-cost, small-sized computer with good performance that can be connected to an LCD or TV monitor and operated with a keyboard and mouse. The Raspberry Pi Model B+ used here has a single-core ARM11 processor with 512MB SDRAM and is powered through a 5V Micro USB socket. The sensors are connected to the Raspberry Pi Model B+, which sends the values to servers through the GSM module.

Software Components
This is the major component, in which the ML algorithms are used to predict which patients are infected with the COVID-19 virus. The next section discusses the analysis of the datasets with various ML algorithms, using the data collected from the IoT sensors.

Machine Learning Algorithms for Covid-19 Prediction
KNN is a classification technique that does not build an explicit model from the training data; instead, all of the training data is used in the testing phase. This makes the training phase faster and the testing phase slower and costlier. KNN is therefore considered a lazy learning algorithm, and it is non-parametric because it makes no assumption about the data distribution. The KNN algorithm works well for classifying new data into previously defined classes. It takes new data from COVID-19 patients, measures the distance from the new data to the existing samples of each class, and assigns the new data to the class of the closest samples. When a new data point (the green point) arrives at the classifier, the classifier calculates its distance from the samples of each class using a distance formula.
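For two points $x = (x_1, \dots, x_m)$ and $y = (y_1, \dots, y_m)$ with $m$ features, the distance formulae are:
Euclidean distance: $$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
Manhattan distance: $$d(x, y) = \sum_{i=1}^{m} |x_i - y_i|$$
Chebyshev distance: $$d(x, y) = \max_{1 \le i \le m} |x_i - y_i|$$
Minkowski distance: $$d(x, y) = \left( \sum_{i=1}^{m} |x_i - y_i|^p \right)^{1/p}$$
The Minkowski distance is a generalized form of the other three distances: p = 2 gives the Euclidean distance, p = 1 the Manhattan distance and p → ∞ the Chebyshev distance.
After the distances are calculated, they are compared and the new data is assigned to the class whose samples lie closest to it.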
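The relationship between the four distances can be checked numerically with a short sketch. The function names are illustrative and the points are arbitrary; the last line shows that the Minkowski distance approaches the Chebyshev distance as p grows.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan(a, b):
    """Minkowski distance with p = 1."""
    return minkowski(a, b, 1)

def euclidean(a, b):
    """Minkowski distance with p = 2."""
    return minkowski(a, b, 2)

def chebyshev(a, b):
    """Limit of the Minkowski distance as p tends to infinity."""
    return max(abs(x - y) for x, y in zip(a, b))

a, b = (0, 0), (3, 4)
print(manhattan(a, b))      # 7.0
print(euclidean(a, b))      # 5.0
print(chebyshev(a, b))      # 4
print(minkowski(a, b, 50))  # very close to 4, the Chebyshev value
```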

Algorithm for KNN
Step 1: Load the dataset.
Step 2: Divide the data into training and testing sets.
Step 3: Choose a value of K.
Step 4: Choose a distance function.
Step 5: Select a new data point from the testing set and find its distance to each of the n training samples.
Step 6: Sort the distances and select the K nearest samples.
Step 7: Assign the new data point to the class that receives the most votes.
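The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the study: the two features (body temperature and oxygen saturation), the toy training samples, the labels (1 for infected, 0 for not infected) and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Step 5: distance from the query to every training sample (Euclidean).
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Step 6: keep the k nearest samples.
    nearest = distances[:k]
    # Step 7: the class with the most votes wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Steps 1-2: a tiny invented training set of (temperature, SpO2) pairs.
train_X = [(36.6, 98), (36.9, 97), (37.0, 99), (38.5, 92), (39.0, 90), (38.2, 93)]
train_y = [0, 0, 0, 1, 1, 1]

# Steps 3-4: K = 3 with Euclidean distance.
print(knn_predict(train_X, train_y, (38.4, 91), k=3))  # 1
print(knn_predict(train_X, train_y, (36.8, 98), k=3))  # 0
```

The features here are left unscaled, so the one with the larger numeric range dominates the distance; in practice the features would usually be normalised before applying KNN.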

Performance Evaluation of Model
This research focuses on the predictive task of identifying patients infected with COVID-19 from their symptoms, using data mining and ML algorithms implemented in the Python programming language. Python is a popular general-purpose, dynamic programming language that is used in various fields such as DM, ML and IoT. The implementation illustrates the use of the IoT in healthcare, with the assistance of ML, to accurately predict COVID-19 infections using Python's special-purpose libraries. The proposed models are tested on the data from the proposed IoT framework, and a confusion matrix is used to determine their accuracy. Such evaluation measures indicate the performance and quality of a model built on the IoT dataset. The main performance measures for an ML model include specificity, sensitivity and accuracy. However, this study considers only accuracy for evaluating the proposed models. Accuracy is the percentage of dataset instances correctly classified by the model developed with the ML algorithm, and is expressed as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$
where TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. Figure 3 shows the results of the performance evaluation of the proposed architecture in terms of ML algorithm accuracy.
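As a worked example of the accuracy measure, the sketch below counts TP, TN, FP and FN from a list of true and predicted labels and computes accuracy as a percentage. The labels are invented for illustration and are not results from this study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def accuracy_percent(tp, tn, fp, fn):
    """Share of correctly classified instances, as a percentage."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

# Invented labels: 1 = infected, 0 = not infected.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)                    # 3 3 1 1
print(accuracy_percent(tp, tn, fp, fn))  # 75.0
```

Six of the eight toy instances are classified correctly, which gives 75%.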

Conclusion
The proposed architecture uses data on potential COVID-19 cases and the healthcare records of confirmed COVID-19 cases to develop an ML-based predictive model for accurate identification of COVID-19. The framework also communicates the results to healthcare physicians, who can then respond as quickly as possible to the suspected cases identified by the predictive model and follow up with any further clinical investigation needed to confirm the case. It helps in the quick review of confirmed cases that need to be isolated and given appropriate health care. In the present study, ML algorithms were integrated with the IoT to develop a prediction model for the recovery of COVID-19 patients, using a real-time symptom dataset of COVID-19 patients in India. The three algorithms considered all provide effective identification of potential COVID-19 cases. After testing the dataset with the K-Nearest Neighbour, LR and Naïve Bayes (NB) classifiers, we see that the accuracy of the KNN classifier is much better than that of LR and NB. Therefore, the KNN classifier can accurately predict whether a newly admitted patient is COVID-19 positive or negative. The accuracy obtained with the KNN algorithm is 96.85%.