

MAY 26-27

UBS provides financial advice and solutions to wealthy, institutional and corporate clients worldwide, as well as private clients in Switzerland. UBS's strategy is centred on its leading global wealth management business and its premier universal bank in Switzerland, enhanced by Asset Management and the Investment Bank.


Challenge 1 


A large amount of content is generated every day on social media channels such as Weibo in China. For marketing purposes, UBS would like to know which content netizens in different cities of China find most interesting. With the results, UBS could focus on hosting more relevant client outreach events based on those interests.


We provide you with over 6,000 Weibo posts (unstructured data) extracted from 8 KOLs between 1st Feb and 30th Apr 2018. These posts should first be grouped into categories of interest for easier analysis. You are required to:


A) Classify these Weibo posts into 13 categories of interest according to the category IDs below:


0. Stock

1. Bond

2. Oil

3. Gold

4. Real Estate

5. Chinese Art (painting/ drawing/ calligraphy)

6. Western Art (painting/ drawing/ calligraphy)

7. Jewellery

8. Artefacts

9. Golf

10. Car

11. Overseas Education

12. Young Children Education
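
As a starting point for task A, a keyword-count baseline could look like the sketch below. It is only an illustration, not the method the organisers expect: the keyword lists are hypothetical, cover only a few of the 13 category IDs, and returning `None` for posts with no match is an assumption (the brief does not say how unmatched posts should be handled).

```python
from collections import Counter

# Hypothetical keyword lists keyed by category ID. A real solution would need
# much richer vocabularies or a trained model, and all 13 categories.
KEYWORDS = {
    0: ["股票", "股市"],   # Stock
    1: ["债券"],           # Bond
    2: ["原油", "油价"],   # Oil
    3: ["黄金", "金价"],   # Gold
    4: ["房价", "楼市"],   # Real Estate
    9: ["高尔夫"],         # Golf
}


def classify(post):
    """Return the category ID with the most keyword hits, or None if no hit.

    Ties are broken in favour of the lowest category ID.
    """
    hits = Counter()
    for cat_id, words in KEYWORDS.items():
        hits[cat_id] = sum(post.count(word) for word in words)
    cat_id, count = max(hits.items(), key=lambda kv: (kv[1], -kv[0]))
    return cat_id if count > 0 else None
```

A baseline like this is mainly useful for sanity-checking a stronger classifier, since short posts often mention a topic without using any listed keyword.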


B) Based on each post's likes, retweets and comments on the social network, and the cities of its commentators, rank the 13 interest categories by popularity (a) within each city and (b) across all cities. The most popular interest should be ranked first.
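
One possible way to produce both rankings in task B is sketched below. The popularity score (an unweighted sum of likes, retweets and comments) and the record layout `(city, category_id, likes, retweets, comments)` are assumptions made for illustration; the brief does not define either, and how engagement is attributed to a city depends on the data actually provided.

```python
from collections import defaultdict


def rank_categories(records):
    """Rank category IDs by total engagement, per city and overall.

    records: iterable of (city, category_id, likes, retweets, comments).
    Returns (per_city_ranking, overall_ranking), most popular first.
    """
    per_city = defaultdict(lambda: defaultdict(int))
    overall = defaultdict(int)
    for city, cat_id, likes, retweets, comments in records:
        score = likes + retweets + comments  # assumed, unweighted score
        per_city[city][cat_id] += score
        overall[cat_id] += score

    def order(scores):
        # Highest score first; ties go to the lower category ID.
        return sorted(scores, key=lambda c: (-scores[c], c))

    return {city: order(s) for city, s in per_city.items()}, order(overall)
```

Weighting the three signals differently (for example, counting a retweet more than a like) only requires changing the `score` line.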


C) In addition, demonstrate how the above results could add further value to UBS's business as a marketing solution.


At the end of the competition, please submit: 


  1. Your programming source code

  2. Your classification result in a txt file (please download the template here)

  3. Your ranking results in a single Excel file (please download the template here)

  4. A PowerPoint file clearly stating how to run the algorithm and your marketing ideas

  5. A list of the external libraries used (if any)


Note: Data access method will be disclosed during the competition.



RADICA is a leading Big Data Solution Provider that offers secure, easy-to-use and quality data-driven solutions to early technology adopters in the new era of marketing. Headquartered in Hong Kong, RADICA helps multinational brands and enterprises discover hidden opportunities by connecting data across different sources with machine-learning and big data technology, helping them further optimize the value of their databases.



Challenge 2 


As consumer demands evolve towards digitally enabled experiences, more data is becoming available on the Internet for the retail sector to analyse. Retailers are eager to obtain more external data (open data) to map against their internal data, in the hope of gaining further business insights and understanding trends.


You will be provided with a list of hyperlinks representing different data sources on the Internet. Each link has been assigned one of 3 Tiers of Difficulty according to how difficult it is to write a crawler for it (i.e. Tier 1 = Least Difficult / Tier 3 = Most Difficult). You are required to crawl as much useful open data for the retail industry as possible, and to demonstrate how this external data could benefit the retail industry as a profitable solution in the future.


Participation Criteria:

i)      Each dataset should have at least 50 rows. If the dataset originally has fewer than 50 rows, please crawl all rows.

ii)      Each dataset should have at least 2 columns.

iii)     You need to crawl at least 5 datasets within 24 hours.

iv)     All column features should be distinct.
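
Before submitting, the criteria above can be checked automatically. The sketch below is one hedged interpretation: the function names are hypothetical, "distinct column features" is read as "no repeated column names", and the `source_has_fewer_rows` flag is an assumption to cover sources that genuinely contain fewer than 50 rows.

```python
def check_dataset(header, rows, source_has_fewer_rows=False):
    """Return a list of criteria a single crawled dataset fails."""
    problems = []
    if len(rows) < 50 and not source_has_fewer_rows:
        problems.append("fewer than 50 rows")
    if len(header) < 2:
        problems.append("fewer than 2 columns")
    if len(set(header)) != len(header):
        problems.append("duplicate column names")
    return problems


def check_submission(datasets):
    """Check every dataset and the minimum count of 5 datasets.

    datasets: dict mapping a dataset name to a (header, rows) pair.
    """
    report = {name: check_dataset(h, r) for name, (h, r) in datasets.items()}
    if len(datasets) < 5:
        report["_submission"] = ["fewer than 5 datasets"]
    return report
```

An empty list for a dataset means it passed every check under these assumptions.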


* One of the datasets carries an additional “UNICORN Bonus” because it is extremely difficult to crawl. If you manage to hunt it down, your total score will be increased by 25%!


At the end of the competition, please submit: 

  1. Your programming source code

  2. Your crawling results in a single Excel file (please download the template here)

  3. A PowerPoint file clearly stating how to run each crawler and your marketing ideas

  4. A list of the external libraries used (if any)


Note: The list of hyperlinks will be disclosed during the competition.



PolyU is a government-funded tertiary institution in Hong Kong with a total headcount of about 29,000 full-time and part-time students. It is fully committed to academic excellence in a professional context with a view to designing, developing and delivering application-oriented education and training programmes. It also engages in a broad portfolio of research and scholarly activities in a focused manner, with special emphasis on applied research.


Challenge 3

Admission to university is an important milestone in most people’s lives. Many different forms of data related to university admission exist, and deep analytics on such varied data can yield valuable information for high school students and other stakeholders.


This challenge consists of two parts: (i) predictive analytics and (ii) descriptive analytics. In predictive analytics, your task is to predict the maximum and minimum HKDSE scores obtained by 2012-2017 admittees to three PolyU degree programmes, namely Computing, Business and Nursing. As the history of HKDSE is short and the publicly available data is limited, the challenge is how to use auxiliary data, such as data from other related degree programmes, to improve prediction accuracy. You will be provided with some sample data for this part, and you are welcome to incorporate more auxiliary data. Your work will help PolyU better understand the attractiveness of different programmes.
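
One simple way to use a related programme as auxiliary data is to fit a straight line between the two programmes' scores in the years where both are known, then apply it to a year where only the auxiliary score is known. The sketch below shows the idea with ordinary least squares; the function names are hypothetical, the numbers in the usage note are made up purely to show the mechanics, and nothing here is claimed to be the intended solution.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two different x values
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, x):
    """Estimate the target programme's score from the auxiliary score x."""
    return slope * x + intercept
```

For example, with made-up auxiliary scores `[20, 22, 24]` and target scores `[21, 23, 25]`, `fit_line` returns a slope of 1.0 and an intercept of 1.0, so an auxiliary score of 26 maps to a predicted 27.0. With only a handful of years available, such a fit is fragile, so comparing it against a plain historical average is a sensible check.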


In descriptive analytics, you are asked to crawl useful and relevant data from websites and online forums, for which a recommended list will be provided, and then mine/analyse the data to discover insights about PolyU’s programmes. While the type of analysis is open, you may consider clustering the available JUPAS degree programmes based on their admission requirements and historical application and admission data (as in predictive analytics). Programmes within a cluster are expected to be closely related to each other in admitting quality students, and universities can benefit greatly from such information in future programme design and marketing strategy.
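
The clustering suggested above could be prototyped with k-means, as in the minimal sketch below. k-means is only one of many possible choices and is not prescribed by the brief; the feature vectors (for instance an admission score and an applicants-per-place ratio) are assumptions, and features on different scales should be normalised before clustering.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster numeric feature vectors; returns one cluster label per point.

    Initialisation is deterministic (the first k points), which keeps the
    sketch reproducible but is cruder than random restarts or k-means++.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Programmes that receive the same label form one cluster; trying several values of `k` and inspecting which programmes group together is usually more informative than a single run.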


At the end of the competition, please submit:

1.    Your programming source code

2.    Your results in predictive analytics (see this sample submission file)

3.    Your results in descriptive analytics, including the crawled data and possibly the cluster information (see the sample submission file)

4.    A document (i.e. a PowerPoint presentation file) highlighting your formulation, methodology, findings and tools used.


Note: Data access method will be disclosed during the competition.