Proceedings of the International Conference on Emerging Challenges: Business Transformation and Circular Economy (ICECH 2021)

Building a Geo-Demographic Segmentation Model: the Case of Hanoi City, Vietnam

Authors
Le Thu HANG1, *, Bui Nguyen Anh TUAN2, Van Duc MANH3, Nguyen Quynh CHI4, Bui Thien BINH5, Tran Ngoc DIEP6
1PhD Candidate, Faculty of Business Administration, Foreign Trade University, Hanoi, Vietnam
2PhD Candidate, Vietnam Competition Council, Ministry of Industry and Trade, Vietnam
3,4,5,6English 1 Advanced Program – K57, Faculty of Business Administration, Foreign Trade University, Hanoi, Vietnam
*Corresponding author: hanglt@ftu.edu.vn
Corresponding Author
Le Thu HANG
Available Online 7 December 2021.
DOI
10.2991/aebmr.k.211119.048
Keywords
Geo-demographic segmentation; K-means clustering; Big data; Principal Component Analysis; Location analysis
Abstract

Location is one of the most crucial factors determining enterprises’ strategies when entering, exploiting or expanding into a new market or a new area. The study illustrates how to create a detailed analytical model of market segmentation across all districts of Hanoi using K-means clustering and principal component analysis (PCA). The model describes the demographic characteristics of each area, such as age, occupation, and education level, thereby giving enterprises precise information about geographic location, which helps reduce the cost and time of the decision-making process when entering or expanding a business in a new area.

Research purpose:

With the rapid development of today’s society, especially the boom in information technology, the economy is becoming more complex, the market is expanding, and competition is becoming increasingly fierce. This trend requires businesses in any industry to make full use of all resources and opportunities to gain a competitive advantage in the market. One of the most important prerequisites for the success of a business is identifying and reaching the right potential customers. More specifically, one of the most popular ways to reach customers is to find a geographical location that fits the needs of the business.

Because the model of market segmentation by geographic location and population is highly applicable to business activities, especially in identifying the right customers, many businesses have conducted research and come up with the geographic models that best fit their strategies. However, these studies are not widely published and cannot be applied to the activities of other businesses. Therefore, to provide an accurate and customized analytical model of the market distribution across the districts of Hanoi, our research team chose the topic “Building a geo-demographic segmentation model: The case of Hanoi”. Based on this model, businesses in Vietnam, especially in Hanoi, can seek new potential markets and suitable areas in which to expand geographically.

Research motivation:

Because the model of market segmentation by geographic location and population is highly applicable to business activities, especially in identifying the right customers, many businesses have conducted research and come up with the geographic models that best fit their strategies. However, these studies are not widely published and cannot be applied to the activities of other businesses. Therefore, to provide a general, accurate, and detailed analytical model of the market distribution across the districts of Hanoi, our research team chose the topic “Building a geo-demographic segmentation model: The case of Hanoi”. Based on this model, businesses in Vietnam, especially in Hanoi, can seek new potential markets and suitable areas in which to expand geographically.

Research design, approach and method:

The order of methods used in this study is as follows:

  1. Collect population data (CSV file) and geodatabase (JSON file)

  2. Perform raw processing and data cleaning

  3. Perform population data analysis using the Principal Component Analysis (PCA) method

  4. Use the principal components obtained through PCA to determine the number of clusters

  5. Perform K-means clustering on the population data

  6. Optimize the number of clusters n for K-means clustering using the Elbow method

  7. Find the exact number of clusters n for K-means clustering using the Silhouette method

  8. Group wards, communes and townships into clusters

  9. Link the population data table to the spatial points (polygons) in the geodatabase

  10. Perform geographic segmentation mapping
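The analytical core of the steps above (PCA, then K-means with the Elbow and Silhouette checks) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract only states that Python was used, so the choice of NumPy and scikit-learn, the 90% explained-variance threshold, the candidate range of k, and the function names reduce_with_pca, scan_cluster_counts and segment are all assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def reduce_with_pca(features, variance_to_keep=0.90):
    """Standardize each column, then keep enough principal components
    to reach the requested share of variance (assumed threshold)."""
    features = np.asarray(features, dtype=float)
    scaled = StandardScaler().fit_transform(features)
    pca = PCA(n_components=variance_to_keep, svd_solver="full")
    return pca.fit_transform(scaled), pca


def scan_cluster_counts(components, k_values, seed=0):
    """Fit K-means once per candidate k and record the inertia
    (for an Elbow plot) and the mean silhouette score."""
    inertias, silhouettes = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(components)
        inertias[k] = float(model.inertia_)
        silhouettes[k] = float(silhouette_score(components, model.labels_))
    return inertias, silhouettes


def segment(features, k_values=range(2, 9), seed=0):
    """Return (best_k, labels, inertias, silhouettes).
    Here k is chosen by the highest silhouette score; the inertias are
    returned so the Elbow curve can be inspected alongside it."""
    components, _ = reduce_with_pca(features)
    inertias, silhouettes = scan_cluster_counts(components, k_values, seed)
    best_k = max(silhouettes, key=silhouettes.get)
    final = KMeans(n_clusters=best_k, n_init=10, random_state=seed)
    labels = final.fit_predict(components)
    return best_k, labels, inertias, silhouettes
```

Calling segment on a table with one row per commune-level unit and one column per demographic indicator would return one cluster label per row. Picking k automatically from the silhouette score is a simplification; in practice the Elbow curve is usually read by eye and compared with the silhouette result before fixing the number of clusters.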

Main findings:

In this study, the geographic segmentation model is applied to 582 commune-level administrative units of 30 district-level administrative units in Hanoi city.

The study tries to identify the common characteristics of each group based on the component matrix generated from the commune-level administrative units. Since there is no correlation between the components (clusters) formed, the properties of each component can be interpreted and determined independently of the others. The four main components (four clusters) obtained are treated as dependent variables, and the descriptive data listed in the research methods section, which are used to explain these clusters, are treated as independent variables.

The study focuses on describing some basic characteristics of geography, population, age, and number of students in order to distinguish the clusters.

Practical/managerial implications:

With the rapid development of information technology and the increasingly complex growth and expansion of the economy, the survival of enterprises in any field depends directly on their competitiveness in the market. To strengthen their competitive advantage, businesses need to recognize the importance of identifying all factors related to their potential customers. Combined with geographic information management technology, population data for each specific area allow administrators to decide effectively where to reach customers and promote their brands.

Based on K-means clustering and Principal Component Analysis (PCA), implemented in Python, this study analyzes the 2020 population data of Hanoi, divides it into four clusters with their own characteristics, and shows the common clusters in the districts based on the geodatabase of Hanoi City. The result is a geographic market segmentation model for businesses wishing to study and operate in the Hanoi area in the future.
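Linking cluster labels back to the geodatabase, as described above, amounts to a keyed join between the population table and the polygon features. The sketch below assumes the JSON geodatabase is a GeoJSON FeatureCollection and that each feature carries an administrative code in a property named ward_code; both the property name and the function attach_clusters are hypothetical, since the abstract does not describe the file layout.

```python
import copy


def attach_clusters(feature_collection, cluster_by_code, code_key="ward_code"):
    """Return a copy of a GeoJSON FeatureCollection in which every feature
    whose code appears in cluster_by_code gains a "cluster" property,
    together with the list of codes that had no cluster label.

    code_key is a hypothetical property name; adjust it to the real file."""
    joined = copy.deepcopy(feature_collection)  # leave the input untouched
    unmatched = []
    for feature in joined["features"]:
        code = feature["properties"].get(code_key)
        if code in cluster_by_code:
            feature["properties"]["cluster"] = int(cluster_by_code[code])
        else:
            unmatched.append(code)
    return joined, unmatched


if __name__ == "__main__":
    # Made-up codes and labels, for illustration only.
    demo = {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "properties": {"ward_code": "W001"}, "geometry": None},
            {"type": "Feature", "properties": {"ward_code": "W002"}, "geometry": None},
        ],
    }
    joined, unmatched = attach_clusters(demo, {"W001": 2})
    print(joined["features"][0]["properties"], unmatched)
```

Returning the unmatched codes makes mismatches between the population table and the geodatabase visible before mapping, which is a common source of blank polygons on the final segmentation map.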

Copyright
© 2021 The Authors. Published by Atlantis Press International B.V.
Open Access
This is an open access article under the CC BY-NC license.

Volume Title
Proceedings of the International Conference on Emerging Challenges: Business Transformation and Circular Economy (ICECH 2021)
Series
Advances in Economics, Business and Management Research
Publication Date
7 December 2021
ISBN
978-94-6239-462-9
ISSN
2352-5428

Cite this article

TY  - CONF
AU  - Le Thu HANG
AU  - Bui Nguyen Anh TUAN
AU  - Van Duc MANH
AU  - Nguyen Quynh CHI
AU  - Bui Thien BINH
AU  - Tran Ngoc DIEP
PY  - 2021
DA  - 2021/12/07
TI  - Building a Geo-Demographic Segmentation Model: the Case of Hanoi City, Vietnam
BT  - Proceedings of the International Conference on Emerging Challenges: Business Transformation and Circular Economy (ICECH 2021)
PB  - Atlantis Press
SP  - 534
EP  - 542
SN  - 2352-5428
UR  - https://doi.org/10.2991/aebmr.k.211119.048
DO  - 10.2991/aebmr.k.211119.048
ID  - HANG2021
ER  -