Using Machine Learning to Diagnose Cancer – A Tutorial

Introduction

Some of you might have heard about diagnosing health conditions with artificial intelligence and machine learning. Artificial intelligence is a buzzword these days, and to those who know little about programming it might actually seem real. But it’s not, at least not in 2017…

Like Kevin Kelly, I prefer to use AI as an acronym for augmented intelligence to describe learning machines.

So, what do these learning machines do and how come they are so very powerful at certain tasks? Well, let’s look at a specific example.

I’ll be using a machine learning library in Python on a cancer dataset to classify tumors as malignant or benign.
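To give a feel for what "classifying" means here, below is a minimal sketch in plain Python. It is not the library or the dataset used in the tutorial: it assumes a simple k-nearest-neighbours vote, and the feature values (a radius and a texture score) are hypothetical toy numbers made up for illustration.

```python
import math

# Hypothetical toy measurements (radius, texture score). NOT real patient data.
TRAINING = [
    ((12.0, 15.0), "benign"),
    ((11.5, 14.0), "benign"),
    ((13.0, 16.5), "benign"),
    ((20.0, 25.0), "malignant"),
    ((18.5, 22.0), "malignant"),
    ((21.0, 27.5), "malignant"),
]


def classify(sample, training=TRAINING, k=3):
    """Label a sample by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the sample, keep the k closest.
    nearest = sorted(training, key=lambda item: math.dist(item[0], sample))[:k]
    labels = [label for _, label in nearest]
    # Return the label that occurs most often among the neighbours.
    return max(set(labels), key=labels.count)


print(classify((19.0, 24.0)))
print(classify((12.5, 15.5)))
```

A real analysis would use many more features and samples, and would hold out part of the data to check how often the predictions are right.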

Biochemistry – Fatty Acid Metabolism [Video Series]

If you’ve been following my channel on YouTube, you know that some of the videos I make are biochemistry related. I just partially completed a series on fatty acid metabolism, which follows Lehninger’s Principles of Biochemistry textbook. It is likely that I’ll add more videos to the list in the future. But for now, here’s the ‘partially complete’ series:

Genetic Mutations and Celiac Disease – My Analysis of 80 Genomes

This is my third analysis of genotype and phenotype data from OpenSNP, which is a platform where people share their genetic data.

The first analysis was about smoking and the second about diabetes. I took a few genetic mutations (SNPs) associated with these conditions and looked into the genetic and phenotype data provided by the users of the platform.

Intermittent Fasting 16-8 for 8 Weeks in Resistance Trained Males – [2016 Study]

Researchers from universities in Italy, Brazil and the United States did a study comparing resistance trained (RT) athletes who engaged in intermittent fasting (16/8) with RT athletes who ate normally.

The experiment ran for 8 weeks and the study was published in the Journal of Translational Medicine in October 2016. You can read it here.

My purpose with this post is to share some thoughts on this study. I also did a video review.

Genetic Mutations and Diabetes – My Analysis of 115 Genomes

Last week I began analyzing genotype and phenotype data available through OpenSNP, a platform where people share this type of information.

The first phenotype I looked into was about smoking.

Using Python, I took the smoker status reported by users and correlated it with a mutation (rs1051730) in the nicotinic acetylcholine receptor alpha 3 subunit CHRNA3 gene. A few genome-wide association studies (GWAS) linked this mutation to nicotine dependence, alcohol abuse, and susceptibility to developing lung cancer.
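As a rough idea of what such a correlation involves, here is a minimal sketch that cross-tabulates genotype against smoker status. It is not the script from the original post: the records, the genotype strings, and the function names `tabulate` and `smoker_fraction` are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical records: (genotype at rs1051730, self-reported smoker status).
# These values are invented for illustration, not taken from OpenSNP.
records = [
    ("AG", "smoker"),
    ("GG", "non-smoker"),
    ("AA", "smoker"),
    ("GG", "non-smoker"),
    ("AG", "non-smoker"),
    ("GG", "smoker"),
]


def tabulate(records):
    """Count how many users fall into each (genotype, status) cell."""
    return Counter(records)


def smoker_fraction(table, genotype):
    """Fraction of users with this genotype who report smoking, or None if no users."""
    smokers = table[(genotype, "smoker")]
    total = smokers + table[(genotype, "non-smoker")]
    return smokers / total if total else None


table = tabulate(records)
for genotype in ("AA", "AG", "GG"):
    print(genotype, smoker_fraction(table, genotype))
```

With real data, the per-genotype fractions would then be compared with a proper statistical test, since small groups can differ by chance alone.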

My point with the post was to offer a proof of concept and to reveal/interpret the data I got out of my Python analysis. I wanted to create a precedent so that others could freely use and improve my scripts and my approach.

Of course, if you’re a user of OpenSNP, you can gain a lot of insight by looking at your own genotype for this SNP (single nucleotide polymorphism) and comparing it with my findings. To see the exact details of what I did and to download the Python code, go and read the post.

Anyhow, I decided to continue with another analysis.

Analysis of 243 Genomes – My First Report [Nov. 2016]

About two weeks ago I learned about OpenSNP, a website where people can share their genetic information and more. It is similar to 1000genomes, but I think it is much more interesting to work with because, aside from genetic information (SNP sequencing, exome, etc.), most users also share phenotype data; data is not anonymized. This is what sparked my interest.

With phenotype data and users’ genetic mutations – SNPs – (or other relevant genetic information), I could run analyses and find possible correlations. This is applied big data.

In this post, I’ll explain how I conducted my first analysis. I want to provide an outline with enough relevant details so I can have a reference point to make things easier in future analyses. Of course, I could simply do this in private, but I’d rather post it on the blog so that others who are interested in running similar analyses can have a starting point.

This involves: knowledge of genomics, genomics-related software and raw data formats, programming, and a lot of patience.
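To make "raw data formats" a bit more concrete, here is a minimal sketch of reading one common layout. It assumes a tab-separated file with the columns rsid, chromosome, position and genotype, and comment lines starting with `#`; other providers use different layouts. The function name `parse_genotypes` and the sample rows are hypothetical.

```python
def parse_genotypes(lines):
    """Map rsid -> genotype from tab-separated lines, skipping '#' comments."""
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        rsid, _chromosome, _position, genotype = fields[:4]
        genotypes[rsid] = genotype
    return genotypes


# Hypothetical sample rows, made up for illustration.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t12345\tAG",
    "rs0000002\t2\t67890\tCC",
]

print(parse_genotypes(sample).get("rs0000001"))
```

In practice the same function can be fed an open file object, since iterating over a file yields its lines one by one.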

Radiotolerance Lessons from the Tardigrades

Image: female tardigrade containing eggs.

Hashimoto and colleagues (2016) published an article in Nature recently:

Extremotolerant tardigrade genome and improved radiotolerance of human cultured cells by tardigrade-unique protein

Tardigrades, a.k.a. water bears, are some of the most extreme organisms, capable of surviving in the most uninhabitable environments and withstanding insults that would kill other living beings. Examples include: very high and very low temperatures, high doses of radiation, high pressure, outer space, and others.

Here are some of the particularities (in terms of gene expression) of tardigrades:

The Hallmarks of Cancers #1 – Deregulating Cellular Energetics

I wrote a moderate-length review of Hanahan and Weinberg’s papers a few months ago.

In their papers, they discuss the most common similarities among cancers and they base their writing on ~5 decades of research in this field.

While each cancer is unique, especially if we view it from a genetics standpoint, Hanahan and Weinberg discuss 8 hallmarks they found to be common in cancers.

Data from David Blaine’s 44-day Fast – [Metabolic and Physiologic]

Introduction

David Blaine subjected himself to a prolonged fasting experiment that ran from Sept. 5 to Oct. 19, 2003.

“A 30-year-old male, weight 96 kg, height 1.84 m, entered a transparent Perspex box on the banks of the river Thames in London and was suspended in the air from a crane for 44 days. During this period, he took only water to drink.” [2]

At the end of the fast, Blaine had lost 24.5 kg and ~8 BMI points (29 => 21.6). Though his BMI was not life-threatening, he was admitted to the hospital for intensive and careful refeeding, as some of his biomarkers were out of normal limits. [1]

Several research studies have been published based on Blaine’s self-experiment. Let’s see some data…

10-Lecture Course on Science Based Medicine – And Alternative Practices

Introduction

I started watching these lectures on YouTube about two months ago, a bit every day, slowly digesting the information and doing additional searches on the web whenever a topic sparked my interest.

These lectures are presented by Dr. Harriet Hall, a retired family physician and Air Force Colonel. According to Skeptic:

“She writes about alternative medicine, pseudoscience, quackery, and critical thinking. She is a contributing editor to both Skeptic and Skeptical Inquirer, an advisor to the Quackwatch website, and an editor of Sciencebasedmedicine.org, where she writes an article every Tuesday.”

You can find more about Dr. Hall at SkepDoc.info.

I appreciate the work Dr. Hall and others are doing in informing the public about the perils of quackery. And from what I perceive from her work, she doesn’t only try to bring awareness about bad science in alternative and non-scientific practices, but also in conventional medicine.

Thus, I consider her a good example of impartiality. She takes on conventional medicine and its numerous inconsistencies in lecture 9. No matter the pitfalls of conventional medicine, I’d say that it’s better regulated and more science based (given that you can filter through poorly conducted studies – see lecture 9) than bogus medical practices and claims.

I recommend this course to people who want to become more educated in critical thinking and to those who want to be less prone to being phished for phools. Here are the titles of the lectures and a few notes on each.
