Data Files for "Declining Chinese Attitudes Toward the United States Amidst COVID-19"

This dataset encompasses three distinct sets of data analyzed in the study, namely the survey data on favorability to the US, the survey data on trust in Americans, and the social media data.

This data is available at yuxie.com and Princeton DataSpace.

Survey Data on Favorability to the US

The first part of the dataset supports the analyses in Study 1 and Study 3. It draws on three surveys: the Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) in 2019 and 2020, the COVID-19 Multi-Wave Study (CMWS) between 2020 and 2022, and the Survey on Living Conditions (SLC) in 2023.

We append the data from the three surveys into a single file. The micro-level raw data are in [survey-datasets-favorability.csv] and the questionnaire is in [survey-questionnaire-favorability.docx]; together they are sufficient to replicate the results presented in Study 1 and Study 3. We disclose the variables used in the research: the favorability score, the source survey, the year and month of the interview, background information such as education and age, and the survey weights.

Our objective is to examine trends in attitudes toward the US, so the sample is limited to respondents who reported their favorability toward the US: 3,266 observations in SAQURR, 28,897 in CMWS, and 2,592 in SLC.

Survey Data on Trust in Americans

The second part of the dataset provides the information used in Study 4: the 2018 and 2020 waves of the CFPS, Baidu Index data, and data on COVID-19 cases and deaths.

The China Family Panel Studies (CFPS), conducted by Peking University, is a nationally representative, longitudinal, comprehensive, biennial social survey that began in 2010. The outcome of interest in Study 4 is trust in Americans as measured in the 2020 CFPS, with trust measured in the 2018 CFPS serving as the baseline. We confined the sample to respondents who indicated their level of trust in Americans in both the 2018 and 2020 waves (N=17,497). [survey-data-trust-descriptive-sample.xlsx] reports the trust levels in 2018 and 2020 and the change between the two waves.

For the regression analysis, we provide the subsample of respondents who had the “potential” to decrease their trust (trust scored above 0) and who have complete information on location and interview date (N=11,430). They were interviewed over the 23 weeks from July 2020 to December 2020.
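The subsample restriction described above can be sketched as a simple row filter. This is only an illustration: the field names (trust_2018, province, interview_date) are hypothetical and may not match the column names in the released files, and the sketch assumes that "trust scored above 0" refers to the 2018 baseline score.

```python
def regression_subsample(rows):
    """Keep respondents with baseline trust above 0 and complete location/date.

    Field names are hypothetical; adjust them to the actual column names.
    """
    kept = []
    for row in rows:
        trust = row.get("trust_2018")
        # Assumption: the "potential to decrease" condition uses the 2018 score.
        if trust is None or trust <= 0:
            continue
        # Require both a province and an interview date.
        if not row.get("province") or not row.get("interview_date"):
            continue
        kept.append(row)
    return kept


# Made-up records for illustration only.
sample = [
    {"id": 1, "trust_2018": 3, "province": "Beijing", "interview_date": "2020-07-15"},
    {"id": 2, "trust_2018": 0, "province": "Shanghai", "interview_date": "2020-08-01"},
    {"id": 3, "trust_2018": 5, "province": None, "interview_date": "2020-09-10"},
    {"id": 4, "trust_2018": 2, "province": "Gansu", "interview_date": None},
    {"id": 5, "trust_2018": 1, "province": "Hubei", "interview_date": "2020-12-20"},
]
kept_ids = [r["id"] for r in regression_subsample(sample)]
```

In this made-up sample, respondent 2 is dropped for having a baseline score of 0, and respondents 3 and 4 are dropped for missing location or date.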

We measure Chinese public attention to the pandemic in the US using the Baidu Index [https://index.baidu.com/v2/index.html]. Baidu is the largest search engine in China, comparable to Google, and the Baidu Index provides query-based data that reflect the daily intensity of keywords entered into it. To quantify public attention, we applied a logarithmic transformation to the Baidu Index scores for the keywords “美国疫情” (“pandemic in the US”), “疫情” (“pandemic”) and “中美贸易战” (“Sino-US trade war”).

Our analysis in this part also uses COVID-19 cases and deaths data obtained from the Oxford COVID-19 Government Response Tracker [https://www.bsg.ox.ac.uk/research/covid-19-government-response-tracker]. We used two log-transformed measures: the daily number of confirmed cases and the daily number of deaths on the day before the 2020 CFPS interview date. Because of the time difference between China and the US, these statistics are likely the most up-to-date information available to survey respondents who closely follow US news.
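The one-day lag and the log transformation can be illustrated with a short sketch. The function name, the data layout (a dictionary keyed by date), and the counts below are all hypothetical. The natural logarithm is assumed, since the base is not stated here, and the sketch does not address how zero counts were handled.

```python
import math
from datetime import date, timedelta


def lagged_log_measures(interview_date, daily_counts, lag_days=1):
    """Return logged cases and deaths from `lag_days` before the interview date.

    `daily_counts` maps a date to a (cases, deaths) tuple. Natural log is an
    assumption; counts must be positive for the log to be defined.
    """
    reference_day = interview_date - timedelta(days=lag_days)
    cases, deaths = daily_counts[reference_day]
    return math.log(cases), math.log(deaths)


# Made-up daily counts for illustration only.
daily_counts = {
    date(2020, 7, 14): (1000, 100),
    date(2020, 7, 15): (2000, 200),
}

# A respondent interviewed on 15 July is matched to the 14 July counts.
log_cases, log_deaths = lagged_log_measures(date(2020, 7, 15), daily_counts)
```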

[survey-data-trust-analytical-sample.xlsx] contains the variables used in this analysis: trust in Americans in 2018 and 2020, demographic variables, and location details (province) from the CFPS, merged with the Baidu Index data and the COVID-19 cases and deaths data.

Social Media Data

The third dataset depicts trends in attitudes toward the US in Study 2. The data are drawn from 17,182,483 posts containing US-related keywords (美国, 灯塔国, 美利坚, 米国, 美帝) from January 1, 2016, to December 31, 2022, on the Chinese social media platform Weibo, which is similar to Twitter. The substantial size gives us a high level of confidence that the dataset captures prevalent viewpoints on Chinese social media. Each post was labeled with an attitude score toward the US on a five-point scale: -2 (most unfavorable), -1 (somewhat unfavorable), 0 (neutral), 1 (somewhat favorable), and 2 (most favorable). We then fine-tuned a large language model, BERT, on these annotations for two tasks. The first task was binary classification to determine whether a Weibo post conveyed an attitude toward the US. The second task was regression to predict the attitude score.
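How the outputs of the two tasks feed into a daily series is not spelled out here, so the following is only one plausible sketch: posts that the classifier flags as conveying an attitude are kept, and their predicted scores are averaged by day. The field names and values are hypothetical, and the sketch averages over posts, which may differ from the per-user averaging used for the released file.

```python
from collections import defaultdict


def daily_mean_attitude(posts):
    """Average predicted scores per day over posts flagged as attitude-bearing.

    Each post is a dict with hypothetical keys: "date", "expresses_attitude"
    (stage 1, classifier output) and "predicted_score" (stage 2, regression
    output).
    """
    totals = defaultdict(lambda: [0.0, 0])  # date -> [sum of scores, count]
    for post in posts:
        if not post["expresses_attitude"]:
            continue  # posts with no attitude toward the US are excluded
        day_total = totals[post["date"]]
        day_total[0] += post["predicted_score"]
        day_total[1] += 1
    return {day: total / n for day, (total, n) in sorted(totals.items())}


# Made-up model outputs for illustration only.
posts = [
    {"date": "2020-03-01", "expresses_attitude": True, "predicted_score": -1.0},
    {"date": "2020-03-01", "expresses_attitude": True, "predicted_score": -2.0},
    {"date": "2020-03-01", "expresses_attitude": False, "predicted_score": 1.5},
    {"date": "2020-03-02", "expresses_attitude": True, "predicted_score": 0.5},
]
daily = daily_mean_attitude(posts)
```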

The daily attitude score, averaged across all users, is provided in [media-data-average-opinion-us.csv]; it is smoothed with a 360-day sliding window to filter out minor fluctuations.
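A sliding-window average of this kind can be sketched as follows. The window alignment is not specified here, so the sketch assumes a trailing window and averages over however many days are available at the start of the series; a centered window would give a different result. The function name is hypothetical.

```python
from collections import deque


def sliding_mean(values, window=360):
    """Trailing moving average; early positions average the values seen so far.

    Assumption: trailing (not centered) window with partial windows at the start.
    """
    buffer = deque()
    running_sum = 0.0
    smoothed = []
    for value in values:
        buffer.append(value)
        running_sum += value
        if len(buffer) > window:
            running_sum -= buffer.popleft()  # drop the day that left the window
        smoothed.append(running_sum / len(buffer))
    return smoothed


# Small demonstration with a 2-day window instead of 360 days.
demo = sliding_mean([1.0, 2.0, 3.0, 4.0], window=2)
```

With the 2-day window, the demonstration series becomes 1.0, 1.5, 2.5, 3.5.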

Data Publisher

The COVID-19 Multi-Wave Study (CMWS) and the Survey on Living Conditions (SLC) are conducted by the Population Development Studies Center, Renmin University of China. The Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) is conducted by the Institute of Psychology, Chinese Academy of Sciences. The China Family Panel Studies (CFPS) is conducted by the Institute of Social Science Survey, Peking University. The Weibo data is owned by Sina.