Factors of the international success of popular music compositions

The specifics of the world of music. Predicting commercial success in popular music. A theoretical model of the life cycle. Development of a model for international success of musical compositions: block diagram and operationalization of indicators.

Category: Music
Type: master's thesis
Language: English
Date added: 18.07.2020
File size: 1.6 MB


The next parameter is the musician's career stage, which establishes whether a song is released by a beginner or by a long-established musician. This is reflected by two indicators: the number of released albums (including all singles, extended plays, compilations, remix albums, etc.) and the number of years since the first release. The frequency of album releases (the ratio of the first indicator to the second) is also considered.

The last parameter is record companies, expressed in two empirical indicators:

1) Number of record labels that participated in the recording of an album/single/other, within which the song was released

2) Power of label, conditionally divided into three levels:

0 is self-released

1 is released on a small record label

2 is released on a large, well-known label with a long history and a strong reputation

When several record labels took part in the recording, the highest value is used.

The described scheme, with all its indicators, forms the basis of the subsequent modeling.

Data

Focusing on more objective information, we avoid conducting any surveys. Therefore, all data are collected from open sources and reflect uncontroversial characteristics of music and artists.

We consolidate several data sources in the analysis. The first is Billboard chart history. Billboard charts are rankings of music releases across different genres, compiled by Billboard Magazine from statistical data on sales, radio airplay and some other parameters, depending on the nature of the chart. The main charts are the Hot 100 (top 100 songs) and the Billboard 200 (top 200 albums). Data are available since 1958. This source lets us assess success in the traditional way: through chart rank.

The second source is Spotify charts. These charts work more simply than the previous ones: they are based only on how many times a composition, album, or artist has been streamed. Data are available since 2016. This source lets us assess success in the modern way: by demand on streaming services.

The third is the `Million Song Dataset'. The data contain loudness, tempo, key, mode, duration and other objective characteristics for each song. There are more than 2000 unique artists. This dataset is usually used for training genre classifiers. Our goal is first to find the songs from our top list in it and then to add songs from the same release year that were not charted by the chosen sources.

The next sources are `Ultimate music database', `LastFM' and `Discogs'. They contain information about members, discography (date, label), awards, press references and other artist-level details, as well as information about audio recordings, including commercial releases, promotional releases, and bootleg or off-label releases. This will be used as additional information about the performance of the artists on our list.

Wikipedia is used to find the rest of the information.

Sample

To start, a list of the most popular compositions was created. The Billboard Hot 100 and Spotify top-200 charts were downloaded from the official websites. Each chart was collected for the 52 weeks of 2017. Thus, we have about 10,000 chart entries from Spotify and about 5,000 from Billboard. Obviously, most songs are repeated in the charts from week to week, so the data were reduced to unique songs. After the two samples were intersected, only 140 songs were left. To some extent, this justifies having two different approaches to success, traditional and modern: they differ considerably and cover different branches of music. In the end, we have the 140 most popular songs of 2017.
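The reduction of weekly charts to unique songs and their intersection can be sketched as follows. This is a minimal illustration, not the script used in the study: the rows, the normalisation rule (lower-casing and trimming) and all names are assumptions.

```python
# Hypothetical weekly chart rows: (week, title, artist). Real charts would have
# 200 (Spotify) or 100 (Billboard) rows per week; these values are made up.
spotify_weekly = [
    ("2017-01-06", "Song A", "Artist X"),
    ("2017-01-13", "Song A", "Artist X"),
    ("2017-01-13", "Song B", "Artist Y"),
    ("2017-01-20", "Song C", "Artist Z"),
]
billboard_weekly = [
    ("2017-01-07", "song a", "artist x "),
    ("2017-01-14", "Song D", "Artist W"),
    ("2017-01-14", "Song B", "Artist Y"),
]


def unique_songs(rows):
    """Collapse weekly chart rows into a set of normalised (title, artist) keys."""
    return {(title.strip().lower(), artist.strip().lower()) for _, title, artist in rows}


spotify_songs = unique_songs(spotify_weekly)
billboard_songs = unique_songs(billboard_weekly)

# Songs charted by both sources form the "successful" class.
charted_by_both = spotify_songs & billboard_songs
print(sorted(charted_by_both))
```

In practice, matching titles and artists across sources usually needs more careful normalisation (featured artists, remix suffixes), which this sketch ignores.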

Another 140 songs were randomly selected from the Million Song Dataset under the following conditions:

1) A song is released in 2017

2) A song is not charted by Billboard or Spotify

3) A song does not have any awards

The last condition was checked during the data enrichment process. If awards were found, the song was replaced with another randomly selected song from the dataset.
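The selection rule with replacement of awarded songs can be sketched like this. It is an illustration under stated assumptions: the song ids are invented, and `has_award` is a hypothetical stand-in for the manual award check done during data enrichment.

```python
import random


def sample_negatives(candidates, charted, has_award, n, seed=0):
    """Draw n songs at random, skipping charted songs and songs with awards.

    candidates: list of song ids released in the target year
    charted:    set of song ids present in either chart
    has_award:  function song_id -> bool, standing in for the manual award check
    """
    rng = random.Random(seed)
    pool = [s for s in candidates if s not in charted]
    rng.shuffle(pool)
    selected = []
    for song in pool:
        if len(selected) == n:
            break
        if has_award(song):
            continue  # a rejected song is replaced by the next random candidate
        selected.append(song)
    return selected


# Hypothetical ids: songs 0-9 are charted, every 7th song has an award.
candidates = list(range(100))
charted = set(range(10))
negatives = sample_negatives(candidates, charted, lambda s: s % 7 == 0, n=20)
print(len(negatives))
```

Shuffling the pool once and walking through it is equivalent to repeatedly drawing a random replacement without ever picking the same song twice.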

The year of release was chosen for convenience of data collection. It is the first full year for which Spotify has charts. Also, some time has passed since these songs were released, so information about them has already appeared in the sources. This is especially relevant when working with songs that are not on the top list.

Data collection

The data collection process was divided into five sequential stages:

1) Downloading lists of songs charted by Billboard and Spotify.

2) Downloading audio information from the Million Song Dataset project.

3) Preparing scripts and parsing data from Wikipedia and Discogs on instruments, personnel, record companies, the musician's career stage, communication, and promotion for songs from the list of charted songs.

4) Processing and cleaning parsed data.

5) Manually collecting the rest of the information on top-listed songs and all the information on the songs randomly selected from the Million Song Dataset.

In the first step of data collection, the titles and artists of the songs were downloaded directly from the official Spotify website and the official Billboard Magazine website. Spotify does not officially operate in Russia, so I had to use a virtual private network to connect to the platform. Spotify provides its charts by week, and each week has to be downloaded individually. Billboard has ready-packed files with the data for each year.

After the lists were downloaded, preprocessed, and merged, we obtained the number of cases needed for the random sample. A list of songs released in 2017 was downloaded from the official website of the Million Song Dataset. Initially, this list was longer than the number of songs in the top list, for two reasons:

1) Some songs had also entered some charts and won awards; such cases were excluded from the sample.

2) Some songs had no information at all in open sources. Data on these cases might have been the most useful in the analysis, but we cannot use data we do not have. That is one of the limitations of the research.

The Million Song Dataset project provided the research not only with a list of unsuccessful songs, but also with technical audio characteristics: loudness, tempo, key, mode, duration.

The next step was collecting data on the top-listed songs. Since these songs are extremely popular, all the needed information is available in the sources. Wikipedia and Discogs are well-structured sources, and well-filled Wikipedia sections make it easy to get all the necessary information using only the list of page URLs and the marked-up page structure.

Unfortunately, this does not work for songs outside the top list because of the lack of information. For that reason, I had to look for the information manually in open sources: Wikipedia, Discogs, LastFM and Ultimate music database. This part of the data collection was the most problematic. Some of the songs downloaded from the Million Song Dataset were excluded from the sample because of the lack of information.

The last stage of the data collection process continued until the needed number of observations, 140, had been enriched with all the needed information.

A summary of the sections above (operationalization of indicators, sample, data sources, data collection techniques) is presented in concise form in table 1.

Methods and procedures

The current study proposes to solve the defined problem as a machine learning classification problem. Pop music compositions are marked as “successful” and “not successful”. The CatBoost classifier is used for binary classification modeling.

Brief Reference to Classification Tasks

The classification task is to produce a categorical response based on a set of attributes. It has a finite number of answers (usually in the format "Yes" or "No"): whether a photo contains a cat, whether an image shows a human face, whether a patient has cancer.

A classic classification training dataset is Titanic, which contains data on the ship's passengers. Each row represents a unique Titanic passenger, and each column contains a quantitative or categorical attribute for each passenger: passenger ID, survived (the target: the event to be modeled and predicted from the rest of the features), passenger class (1, 2, 3), name (not relevant for training), sex, age, number of siblings or spouses aboard, number of parents or children aboard, ticket (not relevant for training), fare, cabin and port of embarkation (three values). The goal of the task is to build a model that best predicts whether an arbitrary passenger survived.

A classification task does not have to be binary; it can be multiclass. For example, the port of embarkation could be the target in the Titanic data if you are more interested in that than in the `dead or alive' question. Nevertheless, our task is binary: hit or not hit.

Back to the Music Success: Model Setup

Following the example above, we can describe our problem as a machine learning classification task. Each row represents a unique song, indexed by title and artist, and each column contains a quantitative or categorical attribute for each song:

1) Loudness

2) Tempo

3) Key

4) Mode

5) Duration

6) Instruments

7) Number of musicians

8) Average age

9) Share of women

10) Label power

11) Number of labels

12) Album number

13) Musician lifetime

14) Video

15) TV show

16) Movie

17) Other media usage

18) Media usage [sum of 15-17]

19) Live performance

There are 280 rows, 18 attributes, and perfectly balanced classes. The goal of the task is to build a model that can determine whether a song is a hit or not based on the attributes above.

The number of cases randomly selected from the list of uncharted songs equals the number of top-listed songs for one purpose only: quality of training. The algorithm works so as to reach the highest accuracy. For that reason, working with dramatically skewed data (as in real life, where only a tiny layer of musicians is successful) leads to a situation where the algorithm learns to mark all cases as `not popular' and treats hits as random noise. To avoid this, the classes were made perfectly balanced. On the other hand, this means that real-world model quality will be far from as good as in training. Nevertheless, since we are interested in modeling the influencing factors rather than in making predictions, we can accept this.
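The effect of skewed classes on accuracy can be illustrated with a short sketch. The 1% share of hits is an assumption chosen for the example, not a figure from this study.

```python
def accuracy(y_true, y_pred):
    """Share of cases where the predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of true positives that the classifier actually finds."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in predictions_for_positives) / len(predictions_for_positives)


# Hypothetical skewed sample: 10 hits among 1000 songs.
y_true = [1] * 10 + [0] * 990
always_not_popular = [0] * len(y_true)

print(accuracy(y_true, always_not_popular))  # 0.99
print(recall(y_true, always_not_popular))    # 0.0
```

A classifier that never predicts a hit reaches 99% accuracy here while finding none of the hits, which is why the balanced sample is used for training.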

Rationale for selecting the CatBoost classifier as the classification algorithm

Machine learning proposes different methods for solving classification tasks:

1) K-nearest neighbors

2) Decision tree

3) Logistic regression

4) Naive Bayes classifier

5) Support Vector Machine

6) Bagging ensembles

7) Boosting ensembles

KNN is one of the simplest classification algorithms. This is both an advantage and a disadvantage: it is often ineffective on real tasks. Besides classification accuracy, the problem with this classifier is the speed of classification: if there are N objects in the training sample, M objects in the test sample, and the dimension of the space is K, then the number of operations for classifying the test sample can be estimated as O(K*M*N).
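The K*M*N estimate can be seen directly in a brute-force sketch of KNN. The data points and labels below are invented for illustration; the counter only tracks coordinate-level distance operations and ignores the cost of sorting the neighbours.

```python
from collections import Counter


def knn_predict(train_X, train_y, test_X, k=3):
    """Brute-force k-nearest neighbours; returns predictions and the operation count."""
    operations = 0
    predictions = []
    for x in test_X:                                  # M test objects
        distances = []
        for row, label in zip(train_X, train_y):      # N training objects
            dist = 0.0
            for a, b in zip(x, row):                  # K coordinates
                dist += (a - b) ** 2
                operations += 1
            distances.append((dist, label))
        nearest = sorted(distances, key=lambda d: d[0])[:k]
        votes = Counter(label for _, label in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions, operations


train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["miss", "miss", "miss", "hit", "hit", "hit"]
test_X = [(0.5, 0.5), (5.5, 5.5)]

preds, ops = knn_predict(train_X, train_y, test_X, k=3)
print(preds, ops)  # ['miss', 'hit'] 24
```

With K = 2, M = 2 and N = 6 the counter reaches 2 * 2 * 6 = 24, matching the estimate.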

A decision tree is a method based on a tree graph. Its methodological advantages are that it structures and systematizes the problem, and the final decision is made on the basis of logical conclusions. In other words, a decision tree offers the highest level of interpretability of the results: the model returns segments of the sample, defined by concrete attribute values, and reflects the exact (observed) shares of classes in each segment.

Logistic regression is an algorithm for determining the relationship between variables, one of which is a categorical dependent variable while the others are independent. It is a powerful statistical method for predicting events from one or more independent variables. Logistic regression predicts the probability of an event occurring by fitting the data to a logistic curve. This is useful, for example, for credit scoring or for estimating the probability of an earthquake on a specific date.
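A minimal sketch of the logistic curve and of how a linear score is turned into a probability is given below. The weights and the bias are hypothetical and are not fitted to any data from this study.

```python
import math


def logistic(z):
    """Logistic (sigmoid) curve mapping any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for a linear score w.x + b."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(score)


# Hypothetical coefficients for two features.
weights = [1.5, -0.5]
bias = -1.0
print(round(predict_proba([2.0, 1.0], weights, bias), 3))
```

Fitting the model means choosing the weights and the bias so that these probabilities match the observed outcomes as closely as possible.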

The naive Bayes classifier belongs to the family of simple probabilistic classifiers and originates from Bayes' theorem; here it treats the features as independent (this is called a strict, or naive, assumption). Sometimes it works well, but the independence assumption often leads to distrust among stakeholders, and the interpretation of the results is not so intuitive. The naive Bayes classifier is usually used for recognizing faces and other patterns in images or for detecting email spam.

SVM is a whole set of algorithms. Given objects in N-dimensional space that each belong to one of two classes, the support vector method builds a hyperplane of dimension (N - 1) that separates the objects into the two groups. The support vector method is equivalent to a two-layer neural network in which the number of neurons in the hidden layer is determined automatically as the number of support vectors. However, SVM suffers from sensitivity to noise and the absence of feature selection algorithms.

The ensemble method is based on machine learning algorithms that generate multiple classifiers and classify objects in newly received data by averaging their outputs or by voting. Bagging collects complicated classifiers, while simultaneously training the basic ones. Boosting converts weak models into strong ones by forming an ensemble of classifiers (from a mathematical point of view, this is an improving intersection). Ensemble methods are a more powerful tool because they minimize the impact of randomness by averaging the errors of the base classifiers, which reduces the variance.
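Why voting helps can be shown with a small calculation. It assumes that the base classifiers err independently and all have the same accuracy, which real ensembles only approximate; the accuracy of 0.6 is chosen for illustration.

```python
from math import comb


def majority_vote_accuracy(n_classifiers, p_correct):
    """Probability that a majority of n independent classifiers is correct.

    Assumes an odd n and independent errors, so the number of correct
    votes follows a binomial distribution.
    """
    needed = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k) * p_correct**k * (1 - p_correct) ** (n_classifiers - k)
        for k in range(needed, n_classifiers + 1)
    )


for n in (1, 5, 25, 101):
    print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the accuracy of the vote grows with the number of classifiers, even though each one is only slightly better than chance.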

In turn, boosting is represented by many different algorithms: Adaptive Boosting, LPBoost, TotalBoost, BrownBoost, XGBoost, MadaBoost, LogitBoost. Each has its own pros and cons, the comparison of which is beyond the scope of this paper. The newest and, for now, the most powerful of them is CatBoost, an algorithm for building machine learning models developed by Yandex and released as an open-source Python library.

CatBoost shows results of the same decent quality as LightGBM, XGBoost and H2O, but the latter are significantly inferior to it in learning speed. Yandex provides benchmarks for this; screenshots from the official website are shown below in fig. 3 and fig. 4. Another important benefit of the CatBoost classifier is its ability to deal with categorical data. Usually, classifiers require categorical data to be decomposed into a set of dummies or replaced with numerical values (GDP for countries, population size for regions, or simply ordinal numbers if the categories can be ranked). CatBoost is designed to take categorical data as input and choose the best way of treating them mathematically.
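The two usual workarounds for categorical data mentioned above, dummy variables and ordinal numbers, can be sketched in plain Python. The example column is hypothetical and only modelled on the label-power levels described earlier.

```python
def one_hot(values):
    """Decompose a categorical column into a set of 0/1 dummy columns."""
    categories = sorted(set(values))
    return {f"is_{c}": [int(v == c) for v in values] for c in categories}


def ordinal(values, order):
    """Replace ranked categories with their position in the given order."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


# Hypothetical categorical column for four songs.
label_power = ["self-released", "small", "large", "small"]

print(one_hot(label_power))
print(ordinal(label_power, ["self-released", "small", "large"]))
```

With CatBoost neither step is required: such a column can be passed as it is and declared through the `cat_features` parameter.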

Fig. 3 Comparison of quality between CatBoost Classifier and other boosting algorithms. https://catboost.ai/

Fig. 4 Comparison of learning speed between CatBoost Classifier and other boosting algorithms. https://catboost.ai/

Software

All the analysis is performed in the Python programming language, chosen because of the availability of the CatBoost classifier. The Python version is 3.6.4. The libraries used are:

1) Pandas 1.0.3

2) Numpy 1.16.4

3) Seaborn 0.8.1

4) Matplotlib 2.1.2

5) CatBoost 0.23.1

6) Scikit-learn 0.19.1

7) Shap 0.35.0

Parsing of Wikipedia and Discogs was implemented in R. There was no particular reason for that; this part of data collection is simply more convenient for me in R.

Analysis Pipeline

After all the data are collected and pulled together, we can proceed with modeling. Since I gathered all the data myself, I have the privilege of not having to deal with any data cleaning: everything is already done. There are no missing values or mistakenly produced outliers in the data.

So, we can start right away with exploratory analysis. A correlation heatmap of all attributes and the target and a pairplot of all interval attributes and the target are shown in figures 5 and 6.

From these we can see a strong correlation between the `media usage' attribute and two of its components, `TV show' and `other media usage'. This is not a surprise, because the media usage variable is the sum of TV show, movie and other media usage. Individually, TV show, movie and other media usage have few non-zero observations, which is why I decided to combine these three features into a single variable. The CatBoost classifier does not have any problems with multicollinearity, because the algorithm simply looks for segments shaped by the attributes.
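The construction of the combined variable and its correlation with its own components can be sketched as follows. The counts are invented sparse values for eight songs, not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical sparse counts; most observations are zero.
tv_show = [0, 0, 1, 0, 2, 0, 0, 1]
movie = [0, 0, 0, 0, 1, 0, 0, 0]
other_media = [0, 1, 0, 0, 3, 0, 0, 2]

# Media usage is the sum of the three components.
media_usage = [t + m + o for t, m, o in zip(tv_show, movie, other_media)]
print(media_usage)
print(round(pearson(media_usage, tv_show), 2))
print(round(pearson(media_usage, other_media), 2))
```

Because the sum contains each component, a high correlation between them is expected by construction rather than being an empirical finding.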

Fig. 5 Correlation heatmap of all attributes and the target.

Fig. 6 Pairplot of all interval attributes and the target.

The target variable shows a high correlation with only one variable: video. This can also be seen in the contingency table (fig. 7).

Fig. 7 Contingency table of `having an official music video' attribute and the target.

This means that almost half of the sample has neither a video nor success. Since the classes are perfectly balanced, almost none of the unpopular songs have a music video. At the same time, popular songs with a video make up 35% of the sample, that is, 70% of the 50% that are popular.

From the pairplot we can see a substantial difference between popular and unpopular songs in the distributions of average age, album number and loudness.

Since we have a very clean dataset, we do not need any cleaning or preprocessing. Scaling the data is not necessary for CatBoost, but I prefer to do it before modeling. Scaling prevents difficulties in interpreting the magnitude of an individual attribute's importance. So, after exploratory data analysis, the pipeline is:

1) Scaling the data in defined range

2) Define categorical features

3) Define a model

4) Split the sample into train and test subsamples

5) Fit a model

6) Adjust hyperparameters

7) Run a model with found best hyperparameters

8) Get the results

9) Check the quality on test sample

MinMaxScaler from the scikit-learn library was used. It transforms features by scaling each feature individually so that it falls within the given range. I usually use the range [1, 100], and this works quite well on the available data.
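The transformation can be written out by hand to show what the scaler computes. This is a plain-Python sketch of the min-max formula for a single feature, not the scikit-learn implementation; the tempo values are hypothetical, and the treatment of a constant feature is an assumption of the sketch.

```python
def min_max_scale(values, feature_range=(1, 100)):
    """Rescale one feature so that its minimum and maximum map to the range ends.

    Formula: x_scaled = (x - min) / (max - min) * (high - low) + low
    """
    low, high = feature_range
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Constant feature: nothing to rescale, so map everything to the lower bound.
        return [float(low) for _ in values]
    return [(v - v_min) / (v_max - v_min) * (high - low) + low for v in values]


# Hypothetical tempo values in beats per minute.
tempo = [60, 90, 120, 180]
print(min_max_scale(tempo))  # [1.0, 25.75, 50.5, 100.0]
```

The scaling is monotonic, so the ordering of songs within a feature, and therefore the splits a tree-based model can find, stays the same.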

Categorical features are defined simply by their position in the list of features and then passed as model input. The GridSearchCV method from the scikit-learn library was used for tuning hyperparameters. The final model has the following setup:

1) Iterations = 35

2) Depth = 5

3) Learning rate = 0.11

4) Evaluation metric = “F1”
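The setup listed above could be expressed with the CatBoost Python API roughly as follows. This is a sketch, not the code used in the study: the indices of the categorical columns are not given in the text and must be supplied by the caller, and `random_seed` and `verbose` are assumptions added for reproducibility and quiet output.

```python
# Hyperparameters reported for the final model.
PARAMS = {
    "iterations": 35,
    "depth": 5,
    "learning_rate": 0.11,
    "eval_metric": "F1",
}


def build_model(cat_feature_indices, random_seed=0):
    """Create an unfitted CatBoostClassifier with the reported setup.

    cat_feature_indices: column positions of the categorical features
    (hypothetical here, since the text does not list them).
    """
    # Imported lazily so that PARAMS can be inspected without catboost installed.
    from catboost import CatBoostClassifier

    return CatBoostClassifier(
        cat_features=cat_feature_indices,
        random_seed=random_seed,
        verbose=False,
        **PARAMS,
    )


# Typical use, assuming X_train, y_train and X_test are already prepared
# and that columns 2 and 3 are the categorical ones (an assumption):
# model = build_model(cat_feature_indices=[2, 3])
# model.fit(X_train, y_train)
# predictions = model.predict(X_test)
```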

I prefer the F1 metric to the others because it is a balanced compromise between precision and recall. A business may pursue different interests and use different metrics accordingly, but for research purposes I prefer F1.
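F1 is the harmonic mean of precision and recall. The short sketch below computes it from the precision and recall values reported for the train and test samples in table 2; the last call uses invented values to show how the metric reacts to an imbalance between the two.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the train and test samples in table 2.
print(round(f1_score(0.9804, 0.9804), 4))  # 0.9804
print(round(f1_score(1.0000, 0.9706), 4))  # 0.9851

# Hypothetical model with perfect precision but poor recall.
print(round(f1_score(1.0, 0.1), 4))        # 0.1818
```

Unlike the arithmetic mean, the harmonic mean stays low when either of the two components is low.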

Results

The resulting model has extremely high quality metrics (table 2). Since there is no difference in quality between the train and test samples, we can rule out overfitting. Therefore, we can trust our results: the model is valid and stable. The only caveat is that these metrics do not show the real ability to predict success, because the distribution of classes differs dramatically from the real one. Changes in precision and recall by quantile are shown in figure 8. For the given distribution of the target (50%), precision is 0.99 and recall is 0.99. The plot is generated on the test sample.

Table 2 Metrics of quality for train and test samples

Metric       Train     Test
Precision    0.9804    1.0000
Recall       0.9804    0.9706
Accuracy     0.9807    0.9851
F1           0.9804    0.9851
Sample       210       70

Fig. 8 Precision & Recall metrics of the model

Interpreting the CatBoost results needs an auxiliary tool: SHAP analysis. SHAP is an abbreviation for SHapley Additive exPlanations. In fact, this instrument can be used to explain the results of any machine learning model. I use it here because there is no direct way of interpreting CatBoost: we can either explore each tree separately or apply SHAP analysis. The method is based on classic Shapley values from game theory and links optimal credit allocation with local explanations. [Lundberg et al., 2020]
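The idea of Shapley values can be shown on a toy example. The sketch below computes exact Shapley values by brute force over all feature orderings; it is not the algorithm used by the shap library, which relies on much faster methods for tree models. The toy scoring function and its numbers are invented.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    features: list of feature names
    value:    function mapping a frozenset of known features to a model output
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        known = frozenset()
        for f in order:
            with_f = known | {f}
            totals[f] += value(with_f) - value(known)
            known = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(known):
    """Hypothetical hit score: base 0.1, video adds 0.4, label adds 0.2,
    and having both adds a further 0.2 interaction bonus."""
    score = 0.1
    if "video" in known:
        score += 0.4
    if "label" in known:
        score += 0.2
    if {"video", "label"} <= known:
        score += 0.2
    return score


phi = shapley_values(["video", "label", "tempo"], toy_model)
print({f: round(v, 3) for f, v in phi.items()})
```

The interaction bonus is split equally between video and label, tempo receives zero, and the values add up to the difference between the score with all features and the base score.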

To look at the importance of each attribute in the model, we can plot the SHAP values of each attribute. SHAP values show the distribution of each indicator's influence on the model output, both in magnitude and in direction. The color indicates the value of the attribute (not applicable to categorical data). The plot is shown in figure 9.

Using the described approach, we can say the following about the probability of a song being successful. A small number of labels reduces it. Having a published video increases it. Higher loudness is associated with a higher probability. Performing the song at live concerts increases it. A small number of albums released before the song decreases it. Using the song in media increases it. Powerful labels increase it. Being a solo musician increases it. Being young increases it. Being at the beginning of a career, on the contrary, decreases it. Being (or having in a band) a woman decreases it.

Remarkably, we can see some `grey zones' (actually purple) on the plot where it is impossible to distinguish popular and unpopular songs. For example, a small number of albums released before a song does indicate unpopularity, but a high number of albums does not indicate popularity. An advantage of the chosen method is that we can see the exact intervals of attribute values in which the attributes are important. The importance of the rest of the indicators should not be considered.

Fig. 9 SHAP Summary plot of model's attributes importance.

The same results can be obtained from impurity-based feature importances, shown in table 3. This method of evaluating importance is based on the relative rank (or depth) of an attribute in a tree. Attributes used at the top of a decision tree contribute to the final prediction of a larger fraction of samples. Here the method implemented in the scikit-learn library was used. Scikit-learn combines the fraction of samples a feature contributes to with the decrease in impurity from splitting them. As a result, we get a normalized estimate of the predictive power of each feature.

Table 3 Impurity-based feature importance evaluation of model's attributes

Feature                Importance
number of labels       36.989342
video                  14.308478
loudness               9.253934
mu                     8.635178
album number           7.121988
live performance       5.874169
label power            5.417315
average age            4.043978
musician lifetime      2.757620
number of musicians    2.653738
share of women         1.271860
tempo                  0.598751
key                    0.485959
duration               0.276561
other media usage      0.237553
movie                  0.073576
mode                   0.000000
instruments            0.000000
tv show                0.000000
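The values in table 3 can be checked with a few lines of code. The numbers are copied from the table (with `mu' kept as printed there); the choice to look at the top seven features is an assumption of the sketch.

```python
# Feature importances as reported in table 3.
importance = {
    "number of labels": 36.989342,
    "video": 14.308478,
    "loudness": 9.253934,
    "mu": 8.635178,
    "album number": 7.121988,
    "live performance": 5.874169,
    "label power": 5.417315,
    "average age": 4.043978,
    "musician lifetime": 2.757620,
    "number of musicians": 2.653738,
    "share of women": 1.271860,
    "tempo": 0.598751,
    "key": 0.485959,
    "duration": 0.276561,
    "other media usage": 0.237553,
    "movie": 0.073576,
    "mode": 0.0,
    "instruments": 0.0,
    "tv show": 0.0,
}

total = sum(importance.values())
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
top7_share = sum(value for _, value in ranked[:7])
unused = [name for name, value in importance.items() if value == 0.0]

print(round(total, 4))       # 100.0
print(round(top7_share, 2))  # 87.6
print(unused)                # ['mode', 'instruments', 'tv show']
```

The importances are normalized to sum to 100, the seven highest-ranked attributes account for about 87.6 of that total, and three attributes are not used by the model at all.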

Summing up, the results show that the idea of predicting the commercial success of popular music compositions is not unreasonable. All technical information on model quality and the individual importance of attributes has been reported; the results are discussed in the next section.

Discussion

With all the results in hand, we can go back to the study's hypotheses.

1) Commercial success of music compositions is partly determined by technical audio characteristics: loudness, tempo, key, mode, duration, instruments.

The results of the research show poor support for this hypothesis. The only attribute of the six tested that proves important is loudness. This inference seems quite trivial: my view is that a song's loudness depends heavily on the quality of the recording. The rest of the technical audio indicators did not turn out to be important in the resulting model. This corresponds to the theoretical analysis, where we saw that musicians with very similar material may reach dramatically different levels of success.

2) Commercial success of music compositions is partly determined by personnel: number of musicians, average age, gender structure.

The results reveal a significant dependence of the commercial success of popular music compositions on personnel. Being a solo musician increases the probability of a song being successful. Being young increases it as well. Being (or having in a band) a woman decreases it. This reminds me of the glass ceiling problem. Does it really matter even in the music industry? That could be an idea for extending the research.

In general, a young solo male musician looks like quite a natural image of a successful artist from a common-sense point of view. What is interesting is that while young age has a positive effect on success, being at the start of a career has the opposite effect. This points to a specific feature of the music industry: musicians often start their careers very early. A rational explanation lies in the role of the Internet and the ease of access to the public and the media. A young girl or boy can easily make their first attempts public and use mass support (subscribers, likes, shares, etc.) in looking for producers, funds, grants, etc. The popularity of music TV shows such as `Voice' or `BGT' has the same effect.

3) Commercial success of music compositions is partly determined by musicians' career moment: album number, musician lifetime.

Both indicators of career moment show that hits do not come at the beginning of a career. Again, this seems natural: more experience and more material bring more profit. The hypothesis gets empirical support from the present study.

4) Commercial success of music compositions is partly determined by aggressive communication and promotion: video production, participation in tv show, using a composition in movie, other media usage, live performance.

The results show that the hypothesis holds completely. All sorts of promotion (a higher number of labels, having a published video, performing a song at live concerts, using a song in any media, powerful labels) increase the probability of a song being successful.

5) Commercial success of music compositions is partly determined by record companies' characteristics: label power, number of labels.

The last hypothesis concerns the importance of record companies. The number of labels that participated in recording and release turns out to be the most important attribute in the model. Being released on a powerful (large, old and eminent) label also shows a positive relation to a song's success.

The key intention of this study was to show the importance of promotion and the role of the non-creative part of a musician's team. I think that having all the attributes of that factor significantly positive in the model output means that we got what we expected.

Conclusions

The main interest of the study lies in identifying the factors that lead to commercial success in the music industry and evaluating their importance. To do this, data on success and on different characteristics of compositions were collected and a statistical model was built.

Five parameters of the commercial success of popular music compositions, corresponding to the five research hypotheses, were tested: technical audio characteristics (loudness, tempo, key, mode, duration and instruments), personnel (number of musicians, average age, share of women), communication and promotion (video, usage in media, live performances), career stage (number of albums released and years of the musician's presence), record companies (number and power of labels).

Focusing on more objective information, all the data were collected from open sources and reflect uncontroversial characteristics of music and artists. The Billboard and Spotify websites provide information about songs charted in 2017. The Million Song Dataset was used to form an alternative list of songs (a random sample of unpopular pop songs released in 2017) and to enrich the data with technical audio characteristics. Ultimate music database, LastFM, Discogs and Wikipedia were parsed by script and searched manually for all the attributes listed above.

The current study proposes to solve the defined problem as a machine learning classification problem. Pop music compositions are marked as “successful” and “not successful”. After intersecting the Spotify top-200 and Billboard Hot 100 charts, 140 popular songs form the `positive' class for classification. Another 140 songs were randomly selected from the Million Song Dataset and represent the `negative' class. The CatBoost classifier is used for binary classification modeling.

The model setup was 280 observations, 18 attributes, perfectly balanced classes, scaled data, 35 iterations, a depth of 5 splits, a learning rate of 0.11, and `F1' as the evaluation metric. As a result, a model that determines whether a song is a hit or not based on the attributes above was built. All the analysis was performed in the Python programming language. The resulting model has extremely high quality metrics.

The most important attributes revealed by the analysis are the number of labels, having a video, high loudness (presumably determined by high recording quality), usage of a song in media, the number of albums released before a song, live performance of a song and label power. The results show the pivotal role of promotion and of the non-creative part of a musician's team.

All hypotheses of the research were tested, and all but one received empirical support: technical audio characteristics (except loudness) do not show a significant effect on the probability of a song being successful. This strengthens the idea from which the current research problem was born: the way music sounds does not determine its success.

Though the quality of the resulting model is extremely good, we need to remember that this is true only for an artificially generated, perfectly balanced sample. In reality, where commercial success in the music industry is dramatically skewed, predicting success is considerably trickier. However, that fact only highlights the importance of the obtained results.

Conclusion

The present paper is devoted to the uncertainty about the skewed rewards observed in the popular music industry. A body of literature claims that a small number of performers dominate the music industry. This phenomenon cannot be explained solely by the quality of music. The rise of the Internet increases the completeness of the market. Thus, musicians who are equally talented or differ only slightly in talent differ dramatically in the success they achieve.

The key research question of the present study is why very similar music can differ dramatically in the success it achieves. The main goal of the study is to model how success depends on various factors and to determine the key success factors.

The literature was scrutinized, and a theoretical model was elaborated. We explored success and paths toward success from different perspectives: traditional vs. modern ways of communication, local vs. international market, artistic vs. mass legitimation. As a result, an analytical model of reaching success from all these perspectives was formulated. Such factors as language, frequency of concerts and releases, the role of managers and stakeholders, promotion and marketing strategies, and collaboration with outside professional record companies were considered.

A theoretical model representing all parts of the musical product lifecycle and those of their elements that might be related to success has been developed. The resulting model represents the traditional way of music production and distribution and traditional success. The extent to which it can reflect the modern scenario of success was discussed.

The main distinctive feature of the study is the integration of different aspects of the musical product: technical audio characteristics (loudness, tempo, key, mode, duration and instruments), personnel (number of musicians, average age, share of women), communication and promotion (video, usage in media, live performances), career stage (number of albums released and years of the musician's presence in the industry), and record companies (number and power of labels). These are the factors and indicators included in the resulting statistical model.

Based on the theoretical model of the popular music lifecycle, a structural scheme for statistical analysis was elaborated. Each step of the lifecycle was represented by measurable indicators. All the required data on pop music compositions (their success indicator and the characteristics listed above) were collected from open sources by direct downloading, by parsing websites with scripts, and by gathering information manually.

The problem of commercial success was framed as a machine learning classification problem. Pop music compositions were labeled as “successful” or “not successful”. All songs selected for the analysis were released in 2017. Half of them were charted by both Billboard Magazine and the Spotify streaming platform and were therefore labeled “successful”.
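The labeling rule described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the song identifiers, the three input sets, and the helper name `build_balanced_sample` are hypothetical, and drawing the "not successful" half at random from songs that appear in neither chart is an assumption about how the balanced sample was formed.

```python
import random


def build_balanced_sample(billboard, spotify, catalogue, seed=0):
    """Label songs and return a 50/50 sample as (song_id, label) pairs.

    Assumption: a song is 'successful' (1) if it appears in BOTH charts,
    and the 'not successful' (0) half is drawn at random from songs that
    appear in neither chart.
    """
    hits = sorted(set(billboard) & set(spotify))
    uncharted = sorted(set(catalogue) - set(billboard) - set(spotify))
    if len(uncharted) < len(hits):
        raise ValueError("not enough uncharted songs to balance the sample")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    non_hits = rng.sample(uncharted, len(hits))
    return [(s, 1) for s in hits] + [(s, 0) for s in non_hits]


# Hypothetical identifiers, for illustration only.
billboard = {"song_a", "song_b", "song_c"}
spotify = {"song_b", "song_c", "song_d"}
catalogue = {"song_a", "song_b", "song_c", "song_d",
             "song_e", "song_f", "song_g"}

sample = build_balanced_sample(billboard, spotify, catalogue)
```

In this sketch, songs that appear in only one of the two charts are left out of both classes, which keeps the contrast between the two labels sharp.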

A CatBoost classifier was used for binary classification. A statistical model was built that determines whether a song is a hit based on the attributes listed above. The entire analysis was performed in the Python programming language. The resulting model has extremely high quality metrics.

It appears that the most important attributes for predicting the commercial success of popular music compositions are the number of labels, having a video, high loudness (presumably determined by the high quality of a recording), usage of a song in media, the number of albums released before a song, live performance of a song, and label power.

Testing the hypothesis on technical audio characteristics showed no significant effect on the probability of a song being successful (except for loudness). Loudness may be a proxy for the quality of a recording, which may be not a cause of commercial success but a natural consequence of it.

The personnel factor reveals evidence of a glass ceiling problem: according to the obtained results, being a young male solo artist (without a band) increases the chances of success. However, being a solo artist does not mean that the musician acts alone. It means that listeners see only one face, while backstage workers matter a great deal for commercial success.

The results show the importance of being associated with several labels and with eminent, powerful labels. Of course, we cannot be sure which is the cause and which is the effect. This relationship may work in both directions.

Also, being young does not mean being a beginner. The analysis shows the intuitive result that more experience (measured in years of presence in the industry and in the amount of released material) is beneficial. Together with the advantage of being young, these results point to a remarkably early start of the music career among performers of commercially successful songs.

Communication and promotion also appear to be significantly important. Publishing a video, having a song featured in a TV show or a movie, and performing it at live concerts are factors strongly related to success.

To sum up, the results show the pivotal role of promotion, communication, and association with powerful market players. This confirms the idea from which the current research problem was born: the way music sounds does not determine its success. There is a lot more beyond the sound.

The results we obtained reflect the relationship between different factors of music production and performance and the commercial success of musical compositions. The paper expands the existing literature on music scenes and suggests a comprehensive approach for advancing research on the success of pop music compositions. Finally, it makes an important empirical contribution to the field of musicology. Since music is not pure creativity but, like any art, also an industry, the results of the current study may be of interest not only to academics but also to music industry decision-makers such as managers, labels, and producers.

Limitations

The resulting model is well defined and stable, and it shows extremely high precision and recall. This means we can trust the relationships we have discovered and rely on the inferences we have made. However, it would be a mistake to think that we can predict the future success of popular music compositions equally well. The problem is that we built the model on artificially balanced classes. In reality, the distribution of classes is dramatically skewed, which will inflate the errors when the model is applied to real data. This does not mean that our model is bad. It means that the model is good for understanding the difference between popular and unpopular songs, but it will not necessarily make highly accurate predictions on real data.
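The effect of class skew on precision can be illustrated with a short calculation based on Bayes' rule. The recall, false positive rate, and the share of hits among all songs used below are hypothetical numbers chosen only for illustration; they are not the metrics obtained in this study, and the function name `precision_at_prevalence` is likewise an assumption.

```python
def precision_at_prevalence(recall, false_positive_rate, prevalence):
    """Precision implied by Bayes' rule for a given share of true hits."""
    true_pos = recall * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical error rates, NOT the metrics reported in this study.
recall, fpr = 0.95, 0.05

balanced = precision_at_prevalence(recall, fpr, 0.5)   # 50/50 sample
skewed = precision_at_prevalence(recall, fpr, 0.01)    # 1 hit per 100 songs
print(f"balanced: {balanced:.2f}, skewed: {skewed:.2f}")
```

With these assumed rates, precision is 0.95 on the balanced sample but falls to roughly 0.16 when only one song in a hundred is a hit, even though the classifier itself is unchanged.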

The resulting model also has a constraint built into its setup from the very beginning: a limited list of indicators. Based on the literature, a more complex analytical model was developed. However, not all the desired aspects were included in the statistical model. The model can be extended with the remaining aspects in future research.

The collected data did not contain missing values, erroneous outliers, or other noise that usually results from procedural errors, because only factual information collected from valid sources was used. However, this does not mean that we did not miss anything. Collecting information on the songs from the non-charted list was the most difficult part. I did it by hand, because the lack of information on websites makes automated data collection inapplicable. More importantly, I had to skip some songs that were initially selected at random, because I could not find any information on them. Thus, in fact, I did have a missing-data problem, and I solved it by omitting incomplete rows entirely.

The limitations are described above for two reasons. The first is to avoid incorrect interpretations and implications of the results. The second is to indicate opportunities for future research, which could be improved not only substantively but also in its technical and methodological approach.
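The omission of incomplete rows mentioned above (listwise deletion) can be sketched as follows. The record layout, the field names, and the helper `drop_incomplete_rows` are hypothetical and serve only to show the idea; they are not taken from the study's data set.

```python
def drop_incomplete_rows(rows, required_fields):
    """Keep only rows where every required field is present and not None."""
    return [
        row for row in rows
        if all(row.get(field) is not None for field in required_fields)
    ]


# Hypothetical records; field names are illustrative only.
rows = [
    {"song": "song_e", "n_labels": 1, "has_video": True},
    {"song": "song_f", "n_labels": None, "has_video": False},  # value missing
    {"song": "song_g", "has_video": True},                     # field missing
]
complete = drop_incomplete_rows(rows, ["song", "n_labels", "has_video"])
```

Only the first record survives here. Dropping rows this way keeps the remaining data clean, but it can bias the sample if the songs without available information differ systematically from the rest.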

