Copyright © Data Valley 2018

ABOUT
What is BigDatathon?

RADICA works with a different university each time to present BigDatathon - a competition focusing on the application of big data, data analytics, and data science. The competition gathers real business challenges and data from actual companies for you to solve.

We welcome students, graduates, startups, and working professionals to join the challenges.
Each challenge will be assigned a dedicated mentor to help you understand the problem and provide guidance during the Datathon prototyping stage.

Hurry and gather your team of 2-5 people and register! 
*Please note that each team must choose ONE challenge from the list and cannot propose its own topic. The same topic can be worked on by more than one team.

Participants own the rights to the projects (i.e. source code, Excel files, and other deliverables) they create during the BigDatathon. The copyright, intellectual property, and other rights to all datasets provided in the Datathon are solely owned by the data providers and topic sponsors. Participants are required to sign a non-disclosure agreement for the data provided during the competition.

 
Previous WINNERS (2017)
Your time to shine 

Awards

Team Name

Champion and Best Presentation Award: Team Banana

First runner-up and Best Entrepreneur Award: RestaurantFlow

Second runner-up: Fashism

Radica Best Data Hunter Awards: Team 1: The Big Dee; Team 2: Pinky Five

Most Innovative Idea Award: Chocobrick

 
 
Previous TOPICS
(2017 DATATHON)
2018 topics will be released very soon! Stay tuned!

Company

Topic

DFS Group Limited is the world's leading luxury retailer catering to the travelling public. 
 

Challenge 1 
 

Customers of different membership tiers visit DFS's branches in Hong Kong every day. Some are first-time buyers, while others make repeat purchases. Based on the datasets provided for Jan 2015 - Oct 2016 (22 months), please:

  1. Predict the sales trend, broken down by i) branch and ii) membership tier, for the period of Nov 2016 - Mar 2017 (5 months) (hidden from participants).

  2. Find out if there is any correlation between the list of 50 brands mentioned by the two KOLs, i) Mr Bags and ii) Gogoboi, on Xiaohongshu (小紅書) and Weibo, and the sales figures of those brands. Moreover, please explain how you will handle the leap time issue.

  3. Explain how to further optimize the business value based on your findings.
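One possible way to approach item 2 is to correlate mention counts with sales at several time offsets, since sales may react to a KOL mention only some periods later. The following is a minimal sketch, not a prescribed method: the weekly granularity, the function names, and the sample numbers are all made up for illustration, and the real datasets may call for a different approach.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation; treat as 0
    return cov / sqrt(var_x * var_y)


def lagged_correlations(mentions, sales, max_lag):
    """Correlate mentions in period t with sales in period t + lag."""
    result = {}
    for lag in range(max_lag + 1):
        paired_mentions = mentions[: len(mentions) - lag]
        paired_sales = sales[lag:]
        result[lag] = pearson(paired_mentions, paired_sales)
    return result


# Hypothetical weekly data: sales follow mentions two weeks later.
mentions = [1, 5, 2, 8, 3, 9, 4, 7, 2, 6]
sales = [10, 10] + [10 + 3 * m for m in mentions[:-2]]

correlations = lagged_correlations(mentions, sales, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(best_lag, round(correlations[best_lag], 3))
```

The offset with the highest correlation gives a rough estimate of how long sales take to respond; a high correlation alone does not show that the mentions caused the sales.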


Note: Data access method will be disclosed during the competition.

Godiva Chocolatier is a manufacturer of premium fine chocolates and related products founded in Belgium in 1926.

 

Challenge 2 

People usually buy chocolates on different occasions for different purposes. Chocolate is sold in different categories such as gift boxes, biscuits, ice-cream, coffee, etc. Based on the datasets provided for Jan 2015 - Oct 2016 (22 months), please:


  1. Predict the chocolate consumption trend, broken down i) by product category and ii) in terms of sales figures and number of units sold, for the period of Nov 2016 - Mar 2017 (5 months) (hidden from participants).

  2. Find out if there is any correlation in the chocolate consumption trend i) among different product sub-categories and ii) with any one external factor other than festival factors (e.g. temperature, a football final, etc.). Note that you need to crawl the data for the external factor yourself.

  3. Explain how to further optimize the business value based on your findings.
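As a starting point for item 1, a simple baseline is the seasonal naive forecast, which predicts each future month with the value from the same month one year earlier. The sketch below assumes monthly totals per product category; the category names and figures are invented for illustration, and a competitive entry would likely need a stronger model.

```python
def seasonal_naive_forecast(history, horizon, season_length=12):
    """Forecast each future period with the value from one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    forecast = []
    for step in range(horizon):
        # Index of the same calendar month one season before the target month.
        source_index = len(history) + step - season_length
        if source_index < len(history):
            forecast.append(history[source_index])
        else:
            # Beyond one season ahead, reuse an earlier forecast value.
            forecast.append(forecast[source_index - len(history)])
    return forecast


# Hypothetical monthly unit sales, Jan 2015 - Oct 2016 (22 months).
history_by_category = {
    "gift_boxes": [100 + 10 * (i % 12) for i in range(22)],
    "ice_cream": [50 + i for i in range(22)],
}

# Forecast Nov 2016 - Mar 2017 (5 months) for every category.
forecasts = {
    category: seasonal_naive_forecast(history, horizon=5)
    for category, history in history_by_category.items()
}
for category, values in forecasts.items():
    print(category, values)
```

With 22 months of history, the five forecast months map back to Nov 2015 - Mar 2016, so the baseline captures the year-end seasonality but ignores any year-on-year growth.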


Note: Data access method will be disclosed during the competition.

RADICA is a leading big data marketing solution provider headquartered in Hong Kong and established at the Hong Kong University of Science and Technology in 2000.

Challenge 3 

Hong Kong people are more likely to share their comments on restaurants on various social media platforms than to write them on Openrice. Please suggest a method for ranking the top 50 restaurants (with as much information as possible) and determine the absolute rankings on a rational basis.

Please also suggest a method for screening out fake data or comments from your ranking. The Top 50 restaurants can be the Top 10 of each of Chinese / Western / Japanese / Indian / Thai restaurants.

At the end of the competition, please submit:

  1. Your programming source code.

  2. The list of Top 50 restaurants (Top 1 at the top) with all information crawled, in Excel or CSV. All the features should be distinctive.

  3. One screen capture of each data source, clearly identifying the features used in your result.
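To illustrate what a ranking method with fake-comment screening might look like, here is a minimal sketch. It assumes each crawled comment carries a numeric rating, treats copy-pasted comment text as suspicious, and ranks by a Bayesian average so that restaurants with few comments are pulled toward the overall mean. The field names, the screening heuristic, and the sample comments are all hypothetical; real fake-review detection would need considerably more signals.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so copy-pasted comments compare equal."""
    return " ".join(text.lower().split())


def screen_duplicates(comments):
    """Keep only the first occurrence of each normalized text per restaurant."""
    seen = set()
    kept = []
    for comment in comments:
        key = (comment["restaurant"], normalize(comment["text"]))
        if key not in seen:
            seen.add(key)
            kept.append(comment)
    return kept


def rank_restaurants(comments, prior_weight=2):
    """Rank by a Bayesian average: ratings are pulled toward the global mean."""
    global_mean = sum(c["rating"] for c in comments) / len(comments)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for c in comments:
        totals[c["restaurant"]] += c["rating"]
        counts[c["restaurant"]] += 1
    scores = {
        name: (totals[name] + prior_weight * global_mean) / (counts[name] + prior_weight)
        for name in totals
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical comments crawled from two sources.
comments = [
    {"restaurant": "A", "source": "forum", "text": "Best dim sum ever!", "rating": 5},
    {"restaurant": "A", "source": "blog", "text": "best dim sum  ever!", "rating": 5},
    {"restaurant": "A", "source": "forum", "text": "BEST DIM SUM EVER!", "rating": 5},
    {"restaurant": "A", "source": "blog", "text": "Slow service, average food.", "rating": 2},
    {"restaurant": "B", "source": "forum", "text": "Great ramen, fair price.", "rating": 4},
    {"restaurant": "B", "source": "blog", "text": "Solid noodles and friendly staff.", "rating": 4},
]

clean = screen_duplicates(comments)
ranking = rank_restaurants(clean)
print(ranking)
```

In this made-up sample, the repeated five-star comment would put restaurant A first if left in; after screening, B ranks above A.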

 

 

Challenge 4 

Please find out the top 50 hottest fashion items among these 500 items and present as much information about them as possible (e.g. price, news, comments, etc.) using a good data visualization approach.

Please also suggest a method for screening out fake data or comments from your ranking. The Top 50 hottest fashion items can be the Top 10 of each of female's fashion / male's fashion / kids' fashion / luxury bags and accessories.

At the end of the competition, please submit:

  1. Your programming source code.

  2. The list of Top 50 fashion items (Top 1 at the top) with all information crawled, in Excel or CSV. All the features should be distinctive.

  3. One screen capture of each data source, clearly identifying the features used in your result.
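The "Top 10 of each category" selection and the CSV deliverable could be produced along the following lines. This is only a sketch under stated assumptions: the "score" field stands in for whatever hotness measure a team defines, and the item names, categories, and prices are invented.

```python
import csv
import io
from collections import defaultdict


def top_n_per_category(items, n):
    """Return the n highest-scoring items of each category, best first."""
    by_category = defaultdict(list)
    for item in items:
        by_category[item["category"]].append(item)
    selected = []
    for category in sorted(by_category):
        ranked = sorted(by_category[category], key=lambda i: i["score"], reverse=True)
        selected.extend(ranked[:n])
    return selected


def to_csv(items, fieldnames):
    """Serialize the items to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


# Hypothetical crawled items; "score" is an assumed hotness measure.
items = [
    {"name": "Canvas tote", "category": "bags", "price": 120, "score": 0.9},
    {"name": "Leather clutch", "category": "bags", "price": 300, "score": 0.7},
    {"name": "Mini backpack", "category": "bags", "price": 90, "score": 0.4},
    {"name": "Denim jacket", "category": "kids", "price": 40, "score": 0.8},
    {"name": "Rain boots", "category": "kids", "price": 25, "score": 0.6},
]

top_items = top_n_per_category(items, n=2)
csv_text = to_csv(top_items, fieldnames=["name", "category", "price", "score"])
print(csv_text)
```

Passing `n=10` would give the per-category Top 10 described above; the CSV text can be written to a file or opened in Excel.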



The Data Studio, located in Hong Kong Science Park, is a new data center with a mission to encourage and stimulate the development of solutions that generate economic and social value from open data and big data.

We encourage participants to use Data Studio @Science Park and/or crawl open data from the Internet.