1

Before I came to this course, I assumed it would be about various algorithms for data analysis. Once I got here, I realized it felt more like a philosophy class than a science class. Unlike traditional STEM courses, what I took away was largely conceptual understanding, which had never happened in any class I’d taken before. For example, when we were asked to pick a “V” or some other letter to explain BDA, I found myself thinking, “Isn’t this basically an English literature class?” The most striking thing I learned from that assignment was that, from a business perspective, BDA does not treat process as the most important thing. Everyone cares about outcomes; without results, anything you do is useless. That already brings philosophical thinking into the picture. At the time, my view was that if the cost of doing the research exceeds the benefits, there is no need to collect and analyze the data, because data by itself has no value. Rather than the fundamentals of BDA, I felt I learned more about philosophical ways of thinking, which was quite fascinating. The course introduced a theoretical tension between traditional academic thinking and commercial thinking. Of course, after a few months of study, my understanding of BDA itself also grew. (e.g., Its massive)

2

If I had already finished this course, I would like to name the Final Project. But since I haven’t completed it yet, I expect my answer will be the BDA vs. Stats poster. Although I’ve taken AP Statistics, my understanding of BDA is very limited, and everything I knew about BDA came from searching online. Even so, through this project I did gain some awareness of BDA in terms of validation and operations.

3

Because my time in the course has been so short, I don’t have much else to write. Setting aside the usual content, I plan to use my personal experience with Obsidian to help Longlong with their Obsidian project in the future. Realistically, even as one of the few people in the class who has studied both statistics and calculus, the help I could offer was still quite limited. The only concrete things I can point to are perhaps the articles I posted on Medium and the BDA vs. Stats poster I made. Active participation in class was probably my biggest contribution.

4. Perceptive

Because I had studied AP Statistics before, I usually approach research using theories learned in that course (e.g., my AP Research project last year on reducing crowding and wait times). The classic statistical procedure is to study a sample obtained through random sampling, stratified sampling, or a similar method, and then infer properties of the population via inferential statistics, often using confidence intervals. After learning about BDA, however, I realized that when the data are sufficiently comprehensive, sampling may be unnecessary. Although I still have to learn the specific methods, this has given me a new way of thinking. Of course, within a BDA mindset I still consider the sampling biases familiar from statistics. For instance, when some factor related to the research target affects which cases end up in the data, the sample no longer represents the whole and the inference becomes inaccurate. The same kinds of issues arise in BDA.
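To make the contrast concrete, here is a minimal Python sketch of the two approaches, not a description of any method taught in the course. It assumes a made-up population of wait times (the numbers, the seed, and the function name `mean_confidence_interval` are all hypothetical) and uses the usual normal approximation for a 95% confidence interval of the mean.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for a mean (normal approximation)."""
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical population: 10,000 simulated wait times in minutes.
rng = random.Random(0)
population = [rng.gauss(10.0, 3.0) for _ in range(10_000)]

# Classic statistics: draw a random sample and infer the population mean.
sample = rng.sample(population, 100)
low, high = mean_confidence_interval(sample)
print(f"Estimate from n=100: between {low:.2f} and {high:.2f} minutes")

# "n = all": with every record available, the mean is computed directly.
print(f"Mean of all records: {statistics.fmean(population):.2f} minutes")
```

With the full data set the interval is no longer needed, but the result is only as trustworthy as the way the records were collected.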

I can’t think of much else, but for me personally the greatest value of BDA may be in helping my game development. A year ago, I started making a rhythm game. It isn’t finished yet, but the core gameplay and other elements were settled long ago. I could try collecting players’ level-play data, such as level difficulty (rating), scores, and number of plays, and then adjust later updates based on these data. In theory, the number of plays across difficulty levels should be roughly normally distributed, because we need to ensure that not everyone can get high scores on the boss levels. If most people are getting high scores on high-difficulty levels, that suggests we need to release even harder levels. If one level’s play count is significantly higher than the others, we can infer that it’s engaging and of high quality (in theory, hard levels attract more attempts, but if the quality is poor, players will dislike them). Data collection may run up against App Store policies and usually requires prior disclosure. Even so, this gives me an excellent way of thinking that I had never considered before. If I have the opportunity after I finish my college applications, I’ll try to incorporate it into my game.
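The two checks described above could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the `LevelStats` record, the level names and numbers, and the thresholds (difficulty 8 or above counts as "hard", a score of 0.9 or above counts as "high", and a play count more than 1.5 standard deviations above the mean counts as "significantly higher"). A real game would need to tune these against actual data.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class LevelStats:
    name: str
    difficulty: int      # hypothetical rating scale, 1 (easy) to 10 (boss)
    plays: int           # total number of plays recorded for the level
    scores: list[float]  # each score as a fraction of the maximum, 0.0 to 1.0


def too_easy_hard_levels(levels, min_difficulty=8, high_score=0.9, share=0.5):
    """Hard levels where more than `share` of the scores are high."""
    flagged = []
    for level in levels:
        if level.difficulty >= min_difficulty and level.scores:
            high = sum(score >= high_score for score in level.scores)
            if high / len(level.scores) > share:
                flagged.append(level.name)
    return flagged


def unusually_popular(levels, z_threshold=1.5):
    """Levels whose play count is far above the average play count."""
    plays = [level.plays for level in levels]
    average, spread = mean(plays), pstdev(plays)
    if spread == 0:
        return []
    return [lv.name for lv in levels if (lv.plays - average) / spread > z_threshold]


# Made-up example data, not real player records.
levels = [
    LevelStats("Intro", 2, 120, [0.95, 0.90, 0.85]),
    LevelStats("Mid", 5, 110, [0.80, 0.70, 0.90]),
    LevelStats("Hard", 8, 100, [0.60, 0.50, 0.92]),
    LevelStats("Boss", 10, 130, [0.95, 0.93, 0.91, 0.70]),
    LevelStats("Hit", 6, 400, [0.80, 0.75]),
]

print(too_easy_hard_levels(levels))  # ['Boss']
print(unusually_popular(levels))     # ['Hit']
```

In this toy data, "Boss" is flagged because three of its four scores are high, which would suggest releasing a harder level, and "Hit" stands out for its play count.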

In theory, BDA should have many more uses for me, but I either haven’t discovered them yet or haven’t had the chance to try them. My intended major is computer science, and because computers and networks are the underlying platform, it’s easy for data collection to end up in “n = all” situations. So BDA should, in principle, be of significant help in my future field. And, as in the game example I mentioned above, because the collection would be mandatory for every player, sampling bias wouldn’t appear.