Machine learning in precision medicine of cancer treatment

Distinguishing "driver" from "passenger" mutations as an important task in applying cancer genomics to the treatment of patients. Separating the mutations that drive tumor growth from benign mutations.

Category: Medicine
Type: article
Language: English
Date added: 19.02.2021
File size: 16.3 K


Faculty of Computer Science and Software Engineering, International Information Technology University


Kokenova U.K.

Almaty, Republic of Kazakhstan

Abstract

Cancer is a genetic disease caused by somatic mutations in the genome. Distinguishing “driver” from “passenger” mutations is an important task in applying cancer genomics to the treatment of patients: the aim is to separate the mutations that drive tumor growth (drivers) from the benign (passenger) mutations. Typically, the search for and analysis of evidence on these mutations is done manually. We need to automate this work with text analysis and machine learning algorithms, and to compare different methods in order to find the best one. The minimum log-loss is obtained with the k-nearest neighbour algorithm.

Keywords: naive Bayes, k-nearest neighbour (kNN), logistic regression (LR), linear support vector machine (linear SVM), random forest.

Kokenova Ulzhan Kazhimukanovna - Master's graduate, Faculty of Computer Science and Software Engineering, International Information Technology University, Almaty, Republic of Kazakhstan

Cancer is one of the most common diseases facing humanity. More than 200 types of cancer are known, and each can be described by specific molecular patterns that require their own therapeutic approaches. The disease also involves complex changes in the genome [1]. Each type of cancer has its own characteristic architecture of recurring genetic aberrations, such as somatic mutations, copy-number changes, altered gene expression profiles and various epigenetic modifications. The need for improved cancer detection, care and prevention has grown, and it is closely tied to a better understanding of the genetic alterations in tumors.

Overview. In recent years there has been much discussion of how precision medicine and, more specifically, genetic testing could change the treatment of diseases such as cancer. Yet this is happening only partially, because of the enormous amount of manual work that is still needed. Sequencing is a technique for identifying the changes in DNA that underlie inherited disorders, inherited predispositions or traits [2]. Once a tumor has been sequenced, it may turn out to carry a large number of mutations. The challenge is to distinguish the mutations that drive tumor growth (drivers) from the benign (passenger) mutations. At present this analysis is performed manually. The molecular pathologist selects a list of genetic variants of interest and then searches the medical literature for evidence relevant to those variants. It is this literature-review stage that we want to replace with machine learning. The molecular pathologist would still have to decide which variants are of interest and collect the relevant data.

Hence the task is to create an algorithm that can automatically classify genetic variants, using an expert-annotated knowledge base as the benchmark.

Goal. The aim is to build a machine learning algorithm that automatically classifies genetic variants, using the knowledge base as a basis. The algorithm for classifying genetic mutations must be based on clinical evidence (text). It could be key to finding more effective therapies for cancer: biologists could use the list of mutations to develop a personalized therapy that targets the patient's cancer cells. This is not an easy task, since interpreting clinical evidence is very difficult even for experts. Modeling the clinical evidence (text) will therefore be crucial to the success of our strategy. Our essential objective in classifying clinically actionable genetic mutations is thus to reduce, and where possible eliminate, the manual effort that the clinical pathologist spends identifying and cross-referencing the classes of genes from the scientific literature.

Problems of this type belong to precision medicine. Precision medicine, or personalized medicine, treats diseases by carefully taking into account the genetic characteristics, lifestyle and environment of each patient [4].

In this competition we develop algorithms that classify genetic variants based on clinical evidence (text). The genetic variations are divided into 9 separate classes. Modeling the clinical text will be critical to the effectiveness of the strategy.

The dataset was curated by Memorial Sloan Kettering Cancer Center (MSKCC). It was chosen because it was developed specifically for this purpose. The challenge was initiated by MSKCC and accepted for the NIPS 2017 Competition Track [3].

When building a model for a medical problem, it is important to minimize the error and to express the prediction as class probabilities, as required by the problem definition, so that the uncertainty is explicit and the most probable class can be identified. When predicting a patient's class, we can also attach to the class outcome an explanation of the reasons for the prediction. Interpretability is therefore an essential part of this prediction task.
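Since predictions are expressed as class probabilities, they can be scored with the multi-class log-loss reported in Table 1. The following is a minimal sketch of that metric in plain Python, not the author's code; the function name, the 0-based labels and the clipping constant `eps` are assumptions made for illustration.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class labels (0-based here, an assumption).
    probs:  list of per-sample probability lists, one entry per class.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model assigns zero probability.
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)

# A model that spreads probability uniformly over 9 classes.
n_classes = 9
uniform = [[1.0 / n_classes] * n_classes for _ in range(4)]
baseline = multiclass_log_loss([0, 3, 5, 8], uniform)
print(round(baseline, 3))  # 2.197, i.e. ln(9)
```

A uniform guess over 9 classes scores ln(9), about 2.197, which gives a reference point for reading the log-loss values in Table 1.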

To this end, we chose several easily interpretable algorithms: Naive Bayes, Logistic Regression and Linear Support Vector Machine. We also tried to obtain strong predictions from less interpretable models, namely Random Forest and k-Nearest Neighbour. Analysis of the problem showed that the latency requirement can be relaxed to some degree for this prediction task without increasing the error rate.
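As an illustration of how an interpretable model can turn clinical text into class probabilities, here is a minimal multinomial naive Bayes sketch in plain Python. The text does not say which naive Bayes variant, tokenization or smoothing was used, so the whitespace tokenizer, the Laplace smoothing parameter `alpha` and the toy sentences below are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on whitespace-tokenized text."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(docs)}

def predict_proba(model, text):
    """Return {class: probability}; the probabilities sum to 1."""
    alpha, vocab_size = model["alpha"], len(model["vocab"])
    log_scores = {}
    for label, n_label in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n_label / model["n_docs"])  # log prior
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            # Laplace-smoothed log-likelihood of the token given the class.
            score += math.log((counts[token] + alpha) / (total + alpha * vocab_size))
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {c: v / norm for c, v in exp_scores.items()}

# Toy, made-up sentences and class labels, for illustration only.
docs = ["mutation activates kinase signaling",
        "variant shows no functional effect",
        "activating mutation increases kinase activity",
        "neutral variant with no effect"]
labels = [1, 2, 1, 2]
model = train_naive_bayes(docs, labels)
probs = predict_proba(model, "kinase activating mutation")
print(max(probs, key=probs.get))  # most probable class
```

The per-word log-likelihood terms are what make such a model easy to inspect: the words contributing most to a class score can be listed alongside the prediction.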

First, this is a classification problem, because the output is discrete. Since there are several possible discrete outputs, it is a multi-class classification problem. The overall results from the first stage of applying the algorithms are shown in Table 1. The minimum log-loss is obtained with the kNN algorithm applied to the gene, variation and text features. Although the log-loss is lowest for kNN, we prefer Logistic Regression (with balancing), since the kNN value (1.008) appears to be overfitted.
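The text does not specify what "balancing" means for logistic regression. One common reading, assumed here, is inverse-frequency class weighting, where each class is weighted by n_samples / (n_classes * class_count). A small sketch with a hypothetical label list:

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Hypothetical, imbalanced label list (not the competition data).
labels = [1] * 6 + [2] * 3 + [3] * 1
weights = balanced_class_weights(labels)
print(weights)  # rarer classes receive larger weights
```

With such weights, errors on rare classes cost more during training, which can matter when the 9 classes are unevenly represented.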

Table 1. Results taken from each type of algorithm

| Model | Train log-loss | Cross-validation log-loss | Misclassified % | Test log-loss | Log-loss |
|---|---|---|---|---|---|
| Naive Bayes | 0.9 | 1.24 | 38.34 | 1.27 | 1.24 |
| kNN | 0.48 | 1 | 31.95 | 1.07 | 1.008 |
| Logistic Regression (with balancing) | 0.63 | 1.13 | 34.2 | 1.11 | 1.13 |
| Logistic Regression (without balancing) | 0.62 | 1.14 | 33.08 | 1.15 | 1.14 |
| Linear Support Vector Machine | 0.75 | 1.13 | 35.15 | 1.15 | 1.13 |
| Random Forest Classifier (one-hot encoding) | 0.68 | 1.15 | 42.66 | 1.14 | 1.15 |
| Random Forest Classifier (response encoding) | 0.05 | 1.23 | 41.72 | 1.31 | 1.23 |
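Table 1 lists the random forest with two encodings of the categorical inputs. The article does not define response encoding; the sketch below assumes the usual meaning, in which each category is replaced by a smoothed vector of class probabilities P(class | category) estimated from the training labels. The function names, the smoothing parameter `alpha` and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict

def fit_response_encoding(values, labels, classes, alpha=1.0):
    """Map each category to smoothed P(class | category) over `classes`."""
    per_value = defaultdict(Counter)
    for v, y in zip(values, labels):
        per_value[v][y] += 1
    k = len(classes)
    table = {}
    for v, counts in per_value.items():
        n = sum(counts.values())
        table[v] = [(counts[c] + alpha) / (n + alpha * k) for c in classes]
    return table

def encode(table, value, classes):
    """Unseen categories fall back to a uniform distribution."""
    return table.get(value, [1.0 / len(classes)] * len(classes))

# Hypothetical toy rows; the gene names serve only as example categories.
genes = ["BRCA1", "BRCA1", "TP53", "BRCA1"]
labels = [1, 1, 2, 2]
classes = [1, 2]
table = fit_response_encoding(genes, labels, classes)
print(encode(table, "BRCA1", classes))  # [0.6, 0.4]
print(encode(table, "EGFR", classes))   # [0.5, 0.5]
```

Because the encoding is built from the training labels, it must be fitted on the training split only. Fitting it on all data leaks label information, and a very low train log-loss next to a much higher test log-loss is the kind of gap such leakage or overfitting would produce.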

References

1. Tomczak K., Czerwinska P. & Wiznerowicz M., 2015. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemporary Oncology. 19 (1A), A68.

2. Pareek C.S., Smoczynski R. & Tretyn A., 2011. Sequencing technologies and genome sequencing. Journal of applied genetics. 52 (4), 413-435.

3. Wang L., 2017. Proposal of Kaggle's Personalized Medicine Competition, viewed 11

4. Lyu B., Haque A., 2018. Deep learning based tumor type classification using gene expression data.
