Student-led presentation Never Get Lost in Random Forest is a success

@Alejandra Acedo

They say that the best way to know if you truly understand something is to successfully teach it to others.

Bachelor in Data and Business Analytics students Tharun Komari and Haozhe Huang have to agree. We challenged these two students and their classmates to plan and teach various topics in the fields of statistics, data analytics, and forecasting to other students, graduates, and academics in the form of brief presentations.

Here’s how it turned out.

Tell us about your presentation: Never Get Lost in Random Forest.

We introduced a popular machine-learning algorithm, “random forest,” to students in the data analytics program. During the presentation, we walked through a simple example of classifying patients by whether or not they had heart disease, to illustrate how to build a random forest, how to evaluate it, how to deal with missing values, and the benefits and limitations of the algorithm. Since we are currently covering the theory behind random forests, not actually coding them, we aimed to explain it in simple terms, so no one would get lost.


What was the biggest challenge while working on this?

The biggest challenge was the very limited time we had to explain it to the audience. Since we wanted students to truly understand the logic behind building a random forest, we had to compress a one-hour lecture into a brief presentation. Summarizing the essence of the model without confusing people was a real challenge. In the end, we chose a real-world example that most people can relate to, so that they could follow the different steps.

What did you enjoy the most while working on it?

Our biggest joy from the project came from explaining a technical topic to an audience that was clearly interested in learning about big data. Often, at tech conferences and presentations, speakers tend to stay general and vague, explaining only the trends in technology. As a result, business students are bombarded with buzzwords such as AI, blockchain, and big data, but they have no concrete idea of how a specific model works. On the one hand, it is always useful to receive experts’ insights about tech trends. On the other hand, one can benefit enormously by actually diving into these topics. And the reality is, this is not rocket science: if you explain it using simple language, the audience can absolutely learn something!

How does this project relate to what you’re learning in class?

Random forests, as the name suggests, are built from decision trees, a concept we learned in statistics. The problem with a single decision tree is that it tends to overfit the training data, which makes it less accurate on new data. Random forests address this by combining the results of many decision trees, taking the mode for classification or the average for regression. The topic is therefore highly relevant to our statistics class. The process of dealing with missing values in a random forest is different from the methods we learned in class, so for us, it has definitely broadened what we know.
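
The idea of combining many trees by a vote can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code or data from the presentation: the patient values are made up, each "tree" is reduced to a single split (a stump), and the names train_stump, train_forest, and predict are hypothetical.

```python
import random
from collections import Counter

# Made-up toy data: ((age, cholesterol), label), where label 1 = heart disease.
PATIENTS = [
    ((30, 180), 0), ((35, 190), 0), ((40, 185), 0),
    ((60, 260), 1), ((65, 270), 1), ((70, 250), 1),
]

def most_common_label(labels):
    """Return the mode of a non-empty list of labels."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, feature):
    """Fit a one-split tree on one feature, minimizing training errors."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        left = [label for x, label in rows if x[feature] <= threshold]
        right = [label for x, label in rows if x[feature] > threshold]
        left_label = most_common_label(left)
        # If nothing falls to the right, reuse the left label.
        right_label = most_common_label(right) if right else left_label
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x[feature] <= threshold else right_label

def train_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        feature = rng.randrange(n_features)        # random feature per tree
        forest.append(train_stump(sample, feature))
    return forest

def predict(forest, x):
    """Classify x by majority vote (the mode) over all trees."""
    return most_common_label([tree(x) for tree in forest])

forest = train_forest(PATIENTS)
print(predict(forest, (25, 170)), predict(forest, (80, 300)))
```

A real random forest grows full decision trees and considers a random subset of features at every split, but the two ingredients shown here, bootstrap samples and a final vote, are what keep any single overfitted tree from deciding the result on its own.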


What are your conclusions?

Our conclusion is that a large part of learning should take place outside of the classroom. When we prepare a topic by ourselves, we take an active role and an independent approach to new knowledge. You can only say you fully understand a topic if you can explain it to someone who has not heard of it before.

As the saying goes: seek first to understand, then to be understood.

Projects like Never Get Lost in Random Forest are fundamental in ensuring that students fully understand what they’re learning in the classroom. Our students continually surprise us with their ability to understand complex issues and convey them in an interesting and accessible format to others. We love seeing them get passionate about learning, and we look forward to seeing them apply it later in their professional life!