Crowdfunding: cross-country analysis
Analysis of projects on crowdfunding platforms in different countries. Methods of attracting interest from potential sponsors. Managerial factors that increase the likelihood of a successful start-up project. Overview of the Russian crowdfunding platform.
Category | Management and labor relations |
Type | diploma thesis |
Language | English |
Date added | 04.12.2019 |
File size | 5.2 M |
Unfortunately, the number of such platforms in Russia is limited: there are two roughly equal dominant ones, Boomstarter and Planeta, and fewer than a dozen small platforms, for instance Kroogi, ThankYou.ru, Together, and Naparapet.
The main question was which of the two to pick and which criteria to use to make the right decision. At first, we considered using the older platform; however, both Planeta and Boomstarter were founded in summer 2012, so there is almost no time lag between them. Then, after studying reviews and articles about our potential data sources, we chose Boomstarter as our data source for the reasons stated below.
First of all, it hosts a larger number of projects: according to the platforms' own sites, Boomstarter has 7497 projects with 372 million rubles collected in total (Boomstarter, 2019), whereas Planeta has slightly more than 5 thousand projects with 1 billion rubles collected (Planeta, 2019). Secondly, according to many articles, Planeta is considered more attractive to and more oriented toward artists; as a result, the major categories on the platform are music (18%), movies (16%) and literature (15%) (Planeta, 2019). Boomstarter, by contrast, is more product-oriented and has more diversified projects (more than 20 categories). This point was highly important for us, because one of our hypotheses checks the relationship between category type and collection rate. Also, as we found out, Boomstarter is more attractive to entrepreneurs because of its rather low tariffs compared with other platforms: a creator pays 5000 rubles to publish a project (a fee introduced so that the platform is not overwhelmed by frivolous or joke projects (Osipov & Buseva, 2017)) and a commission of 3.5% if the project is funded at 100% or more, or 0% if the collection rate is lower than 100% (Boomstarter, 2019). Planeta charges 10% if the project achieves 100% or more of its goal and 15% if the collected sum is between 50% and 99.9%; if less than 50% is collected, the author pays no commission (Planeta, 2019). Planeta makes an exception only for charitable projects, in which case it takes only 5.9% if the collection rate is higher than 50%, but these conditions are still much worse than those Boomstarter offers.
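The tariff comparison above can be made concrete with a small calculation. The following is a minimal sketch, not either platform's actual billing logic: it assumes that commissions are charged on the collected sum, that Planeta's top tier starts at exactly 100%, and that a charitable project below the 50% threshold pays nothing. The function names are hypothetical.

```python
def boomstarter_fee(pledged, goal, publish_fee=5000.0):
    """Total cost in rubles: flat publication fee plus 3.5% commission when the goal is met."""
    rate = 0.035 if pledged >= goal else 0.0
    return publish_fee + rate * pledged


def planeta_fee(pledged, goal, charitable=False):
    """Commission in rubles under the Planeta tiers quoted in the text."""
    share = pledged / goal
    if charitable:
        # Assumption: charitable projects pay 5.9% above 50% collected and nothing otherwise.
        rate = 0.059 if share > 0.5 else 0.0
    elif share >= 1.0:
        rate = 0.10
    elif share >= 0.5:
        rate = 0.15
    else:
        rate = 0.0
    return rate * pledged


# A project with a 200,000 ruble goal that collects 250,000 rubles.
print(boomstarter_fee(250_000, 200_000))
print(planeta_fee(250_000, 200_000))
```

Under these assumptions the flat publication fee makes Boomstarter more expensive only for projects that collect less than half of the goal, where Planeta charges nothing.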
After choosing the platform, we wrote Python code that uses the BeautifulSoup library to scrape data from the site. Downloading data from this source is legal: according to the Russian Federation Law on Personal Data, processing personal data is allowed if it is not used by a company and if access to it was provided by the owner himself, which means we may collect and use it for research purposes. For the research, we decided to collect data on all projects existing on Boomstarter. The projects were placed there from August 2012 until the collection date (approximately 24-26 November 2018).
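The scraping step can be illustrated roughly as follows. This is only a sketch: the markup, the class names (project-card, project-title, project-goal) and the sample values are invented for illustration and do not reflect the real Boomstarter pages, and the standard-library html.parser module stands in for BeautifulSoup so that the example runs without extra packages or network access.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real site structure is not described in the text.
SAMPLE_HTML = """
<div class="project-card">
  <a class="project-title" href="/projects/1">Board game</a>
  <span class="project-goal">150000</span>
</div>
<div class="project-card">
  <a class="project-title" href="/projects/2">Short film</a>
  <span class="project-goal">300000</span>
</div>
"""


class ProjectParser(HTMLParser):
    """Collects one dict per project card with its title and goal."""

    def __init__(self):
        super().__init__()
        self.projects = []
        self._field = None  # name of the field whose text is expected next

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if cls == "project-card":
            self.projects.append({})
        elif cls in ("project-title", "project-goal") and self.projects:
            self._field = cls.split("-", 1)[1]

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            value = int(text) if self._field == "goal" else text
            self.projects[-1][self._field] = value
            self._field = None


def parse_projects(html):
    parser = ProjectParser()
    parser.feed(html)
    return parser.projects


print(parse_projects(SAMPLE_HTML))
```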
After retrieving the data from Boomstarter, we had data on 7947 projects, 500 of which were still in progress. To get more accurate results, we excluded unfinished projects from our database: we do not know what percentage they will eventually collect, so these projects may add a certain degree of error.
The Boomstarter dataset gave us the following categories of data: the targeted sum for each project; the percentage of the targeted sum collected by the end of the collection period; the actual sum collected; the author's biography (a text in which the author gives personal information, experience, the reason for starting the project, etc.); the project title; the project website link; the author's social media links (Facebook and Vkontakte); the project category (art, social project, etc.); the number of investors taking part in the project; the number of projects launched by the author; the author's country and city of residence; the state of the project (successful or not, finished or not); and the author's first and last name. The projects placed on Boomstarter were run mainly by Russians from different parts of the world (our dataset covers 60 countries, including Russia). However, only 73 projects were run abroad, which allows us to call this platform Russia-specific. To make our research cross-country, we add data from the world-dominant platform, Kickstarter.
After reviewing the Boomstarter data, we understood that our research, like several previous studies, would be more thorough if we added demographic data about the project authors. This meant collecting data about the creators from Vkontakte and Facebook. Since Facebook's regulations did not allow us to collect the data, we decided to use only the Russian social network Vkontakte. We retrieved the data with Python through the VK API, whose documentation is published on the site for developers, and obtained the following information about each author: the birth date; gender; first, middle and last name; the number of followers; the type of occupation (e.g. whether the author is currently studying, working, retired, etc.); the city and country of origin and residence; religion; political preferences; main traits liked in people; main objectives in life; education status and form; university name; graduation year; university program name; attitude to alcohol and smoking; page status (whether the page is active or deactivated); and much other data.
Despite the large volume of data retrieved from Vkontakte (Vkontakte data was appended to nearly 70% of projects), we used only a small part of it in our research. Some authors did not leave any information about themselves, so adding variables with few observations would have shrunk our sample significantly and reduced its representativeness. In the end, we used the following variables from Vkontakte: the author's age, number of followers, and gender. Next, we describe the variables used in our analysis, which are listed in table 2.
Table 2 Variables description and source.
Variable | Description | Unit of measure | Source |
City_num | Population of the author's city of origin | 1 - 0 to 249'999 people; 2 - 250'000 to 999'999 people; 3 - 1 million people and more | Rosstat; Boomstarter (city of origin retrieved from Boomstarter, population categories from Rosstat) |
Category | Project category (1-10) | 1 - Education; 2 - Other; 3 - Business; 4 - Lifestyle; 5 - Food; 6 - Social entrepreneurship; 7 - Design; 8 - Entertainment; 9 - Technologies; 10 - Art | Boomstarter |
Goal | The targeted sum set by the author to be collected by the project closure date | RUB | Boomstarter |
Pledged | The sum of money pledged by the end-of-collection date | RUB | Boomstarter |
Percentage | The percentage of the targeted sum pledged for a project at a given moment | Percent | Boomstarter |
Per_category | Binary variable indicating the success of the project based on the percentage collected | 0 = 0% funded, 1 = more than 0% but less than 100% funded | Boomstarter (processed) |
Per_category2 | Binary variable indicating the success of the project based on the percentage collected | 0 = 0% funded, 1 = 100% and more funded | Boomstarter (processed) |
Per_category3 | Binary variable indicating the success of the project based on the percentage collected | 0 = more than 0% but less than 100% funded, 1 = 100% and more funded | Boomstarter (processed) |
Biography | Information about the author, published by the author on the platform | Number (length of the author's biography in symbols) | Boomstarter |
Sites | Whether a website with information about the project is indicated | 1 - author has a site, 0 - no | Boomstarter |
Facebook | Whether a link to the author's Facebook account is indicated | 1 - the link is indicated, 0 - no link is indicated | Boomstarter |
Sex | Gender of the project author, as stated on the Vkontakte social network | 1 - Female, 0 - Male | Vkontakte |
Age | Author's age | Years | Vkontakte |
Followers_count | Number of followers of the project author on the Vkontakte social network | Number | Vkontakte |
Because of the high volume of data from vk.com, we decided to simplify it to achieve accurate results. For instance, the author's city of origin had more than 700 unique values, too many to identify common features among the creators. We therefore generated a new categorical variable called city_num: we matched city names with their populations as reported by Rosstat and separated the results into three categories: 3 - one million people and more, 2 - from 250'000 to 999'999, 1 - up to 249'999.
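The recoding into city_num can be sketched as a simple threshold function. The function name matches the variable in table 2, but the lookup table below is a placeholder with made-up city names and populations, not Rosstat figures.

```python
def city_num(population):
    """Map a city population to the three size categories used in table 2."""
    if population >= 1_000_000:
        return 3
    if population >= 250_000:
        return 2
    return 1


# Placeholder populations for illustration only (not Rosstat data).
populations = {"City A": 1_500_000, "City B": 400_000, "City C": 60_000}
print({city: city_num(size) for city, size in populations.items()})
```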
To make our study cross-cultural, we decided to add data from one of the world-famous international platforms. The choice was between Indiegogo and Kickstarter; however, it was clear that picking the latter is easier, as data on its projects is regularly published in open access on sources such as Kaggle. As a result, we downloaded from Kaggle data on 375,389 projects, collected in February 2018, with a total funded sum of $3.4 billion (at present the total number of projects exceeds 440 thousand, with more than $4.3 billion funded). The site offers authors conditions that are very close to those of Boomstarter. An author is charged 5% if the project receives 100% or more and pays no commission otherwise. It is important to note that the platform uses an all-or-nothing funding system: if the author does not reach 100% of the targeted sum, he or she receives no money at all, and the system returns the contributions to the investors.
The dataset consisted of a rather small number of data categories: the targeted sum for a project in dollars, the percentage of the targeted sum (generated), the launch date and deadline, the project's name, the pledged sum in dollars, subcategory, category, the author's country, currency type, and the project's state (cancelled, suspended, failed, successful).
As a result, our variables are listed in table 3:
Table 3 Kickstarter's variables description
Variable | Description | Unit of measure | Source |
Goal | The targeted sum of money set by the creator to be collected by the deadline | American dollars | Kickstarter (Kaggle) |
Pledged | The sum collected by the end date | American dollars | Kickstarter (Kaggle) |
Percentage | The percentage of the targeted sum given to the author at a given moment | Percent | Kickstarter (Kaggle), generated in Excel |
Country | Country in which the project was launched | The USA, Australia, the United Kingdom, Canada | Kickstarter (Kaggle) |
Per_category2 | Binary variable indicating the success of the project based on the percentage collected | 1 - more than 100% funded, 0 - 0% | Kickstarter (Kaggle), generated in Stata |
Category | Project category (1-10) | 1 - Comics; 2 - Publishing; 3 - Dance; 4 - Game; 5 - Food; 6 - Journalism; 7 - Design; 8 - Entertainment; 9 - Technologies; 10 - Art | Kickstarter (Kaggle) |
As in the case of Boomstarter, to obtain accurate results we merged categories into thematic groups to reduce their number; as a result, the number of categories fell from more than 60 to 10 in each case. As we mentioned, the category types vary from platform to platform; in our case only 5 out of 10 coincide. As for the countries represented on Kickstarter, we kept only the four with a decent number of observations (the USA, Australia, the United Kingdom and Canada), so our dataset decreased to 259'668 observations.
3.2 Methodology
In our research we used data from 4 different sources, which means our data set contains a high number of missing values. Consequently, the data we used for analysis contains far fewer observations than our primary data set retrieved from Boomstarter. This is a result of adding several variables from Vkontakte and omitting observations with missing values when conducting the analysis. That is why our descriptive statistics contain only 1871 observations overall, as shown in table 7. The Kickstarter data set has far fewer variables but a stable number of observations, as can be seen in table 6. In this part of our research we describe our analysis step by step.
First of all, we computed descriptive statistics for our variables. Since project success is the main focus of our research, we created a categorical variable reflecting whether a project was successful or not. However, our data set also contains projects that we could not classify as either successful or unsuccessful, because they collected a certain amount of money but did not reach 100% of the targeted sum. We therefore divided our success-indicating variable into three categories:
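\[
\text{success}_i =
\begin{cases}
\text{not funded}, & \text{percentage}_i = 0\% \\
\text{partially funded}, & 0\% < \text{percentage}_i < 100\% \\
\text{fully funded}, & \text{percentage}_i \ge 100\%
\end{cases}
\qquad (1)
\]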
After splitting the success variable, we ran a mean-comparison test in Stata with success as the group-defining variable. To see the differences from country to country, we ran this test separately for Russia (since the Russian data set has more variables) and for each of the four other countries represented on Kickstarter, to check whether the significance of the differences in means varies from country to country.
The mean-comparison test checks whether the mean of a variable differs between two groups. Running ttest (the Stata command for the mean-comparison test) produced three tables, which we then merged into one (table 6) showing the significance of the differences between the means of all variables across the success groups. This gave us a preliminary list of factors influencing project success for 5 different countries. Since these results are purely descriptive, we cannot call them a true contribution to determining the factors that influence project success. The next step of our research was therefore to prepare the data for the probit regressions.
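For readers without Stata, the statistic behind a two-group mean comparison can be sketched in a few lines of Python. This sketch assumes the equal-variance (pooled) form of the two-sample t-test; the function name and the sample values are illustrative, and the p-value lookup is left out.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(a), len(b)
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, n1 + n2 - 2


# Illustrative values of one variable for unsuccessful and successful projects.
unsuccessful = [1.0, 2.0, 3.0, 4.0, 5.0]
successful = [2.0, 3.0, 4.0, 5.0, 6.0]
print(pooled_t_statistic(unsuccessful, successful))
```

The resulting statistic is compared with a Student t distribution with the returned degrees of freedom to judge whether the difference in means is significant.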
First of all, we chose a probit regression model because our dependent variable is not normally distributed: crowdfunding is still not a highly developed mechanism, and there are still many unsuccessful projects. We therefore treated the percentage variable as a latent one to avoid the bias caused by the lack of normality. We describe how we work with the latent variable below. First, we check the possible cross-correlation of the variables that we are going to use in our regression models.
We used the corr Stata command because it omits every observation that has a missing value in any of the variables, which is useful when we want to see the cross-correlations on exactly the observations that will enter our regression model (Stata 13 documentation, 2014). The resulting tables (tables 4 and 5) show the correlation coefficients between the variables, with significant coefficients starred: a starred coefficient is significant at the 5% level or better. The correlation coefficients themselves show the degree to which two variables are linearly related. A coefficient ranges from -1 to 1, and the closer its absolute value is to 1, the more strongly the two variables are related. We start with the correlations between the variables in the Kickstarter platform data.
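The listwise deletion described above, followed by a Pearson correlation, can be sketched as follows. The helper names and the toy rows are illustrative; None marks a missing value.

```python
from math import sqrt


def listwise_complete(rows):
    """Keep only the observations that have no missing value in any variable."""
    return [row for row in rows if all(value is not None for value in row)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Each row is one project: (percentage, goal, followers).
rows = [(1.0, 2.0, 5.0), (2.0, 4.0, None), (3.0, 6.0, 1.0), (4.0, 8.0, 2.0)]
complete = listwise_complete(rows)  # the second row is dropped for every correlation
percentage, goal, followers = zip(*complete)
print(len(complete), pearson(percentage, goal))
```

Because the incomplete row is removed before any coefficient is computed, every pairwise correlation uses the same sample, which is the property the text relies on.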
Table 4 shows the correlation matrix for the Kickstarter crowdfunding platform. The matrix is computed on exactly the observations that will later be used in the probit regression models. We therefore have only about 260 thousand observations, which is a result of decreasing the number of categories to fit the Boomstarter platform data.
As table 4 shows, our dependent variable, the percentage funded for a project, is correlated neither with the targeted sum for a project nor with the number of investors: both correlation coefficients are close to zero. However, the correlation coefficient between the number of investors and the percentage funded is significant at the 5% level. This means that there may be a chance of heteroscedasticity involving these two variables, so we should be attentive when adding this variable to our regression model. Moreover, the correlation coefficient between the number of investors and the targeted sum for a project is both insignificant and close to zero, which means that multicollinearity between these two regressors should not be a problem in our regression model.
Table 4 Correlation matrix for Kickstarter data set
Variables | (1) | (2) | (3) |
(1) Percentage funded for a project | 1.000 | | |
(2) Targeted sum for a project, thousand USD | -0.000 | 1.000 | |
(3) Number of investors, people | 0.017* | 0.003 | 1.000 |
Observations | 259'668 | | |
* shows significance at the .05 level
The correlation table for the Boomstarter platform data is shown in table 5. Since the most fully specified regression model for the Russian data will include all of these variables, we ran this type of correlation matrix to exclude missing values and compute all correlations on the same sample of 1871 observations.
Table 5 Correlation matrix for Boomstarter data set.
Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
(1) Percentage funded for a project | 1.000 | | | | | | | |
(2) Targeted sum for a project, thousand RUB | -0.087* | 1.000 | | | | | | |
(3) Author's biography length in symbols | -0.035 | 0.064* | 1.000 | | | | | |
(4) Author's followers number | 0.044 | 0.029 | -0.022 | 1.000 | | | | |
(5) Author's age | 0.008 | 0.024 | 0.151* | 0.038 | 1.000 | | | |
(6) Gender (1 - female, 0 - male) | -0.039 | 0.078* | -0.004 | 0.015 | 0.059* | 1.000 | | |
(7) Sites (1 - project has a website, 0 - no) | 0.120* | -0.037 | 0.089* | 0.026 | 0.062* | -0.026 | 1.000 | |
(8) Facebook (1 - author has Facebook account, 0 - no) | 0.063* | -0.015 | 0.130* | 0.001 | 0.132* | -0.011 | 0.107* | 1.000 |
Observations | 1871 | | | | | | | |
* shows significance at the .05 level
First of all, we look at the correlation coefficients between the dependent variable and all the others. The targeted sum for a project is negatively correlated with the percentage funded, and this coefficient is significant at the 5% level. However, its absolute value is less than 0.3, which means there is not even a weak correlation between these two variables. The next variable is the length of the author's biography. It is also negatively correlated with the percentage collected, but the coefficient is insignificant at the 5% level and, as with the previous variable, close to zero. The author's number of followers and the author's age are almost uncorrelated with the percentage collected: both coefficients are close to zero and insignificant at the 5% level. The correlation coefficient between gender and the percentage collected is negative and insignificant at the 5% level; as with the other coefficients, its absolute value is low, so these two variables are also almost uncorrelated. The last two variables, the existence of a website and of a Facebook account, also have rather low correlation coefficients, but both are positive and significant at the 5% level.
Finally, for all independent variables the correlation coefficients are rather small and close to zero, which shows that the independent variables are not correlated with our dependent variable. However, this only means that there is no linear relationship between them. Correlation analysis alone cannot tell us whether changes in the independent variables cause changes in the percentage collected for the project. Nevertheless, although there is no strong linear relationship between the dependent and independent variables, the significance of several coefficients tells us that some relationship exists. The targeted sum, website existence and Facebook account existence are therefore the factors to take into consideration in further analysis to understand whether they influence our dependent variable. However, these are also the variables that may raise the probability of heteroscedasticity in the regression model.
Next, we check the possible cross-correlation of the independent variables. First, all the correlation coefficients between the targeted sum and the other variables (except the percentage collected, discussed above) are rather small, so the correlation between them is almost absent. However, biography length and gender have coefficients with the targeted sum that are significant at the 5% level, so we should still be aware of possible multicollinearity between these variables in the regression model. The same holds for the remaining cross-correlations: the coefficients are low, so the variables have no strong linear relationships, but a few coefficients are significant. Firstly, the author's biography length has significant coefficients with the author's age and with the Facebook account and website existence variables. Next, as table 5 shows, the author's age has significant coefficients with the gender, Facebook and website variables. Finally, Facebook account existence has a significant coefficient with website existence. To sum up, we need to be careful with all these significantly correlated variables, as they may cause multicollinearity in the regression model and increase the probability of an error. Next, we choose the variables to include in the model and the type of model best suited to our data.
3.3 Regression models description
In our data set, there are several categorical and several continuous independent variables, as shown in table 1. When running a regression, it is important that the ranges of all dependent and independent variables are similar. We therefore entered the targeted sum, number of followers, author's age and biography length as logarithms to bring them to a similar range. As for the categorical variables, city_num and the project categories, we checked the influence of each category on our dependent variable by adding them to the regression model through i.category_num and i.city_num. Our data also contains binary variables, which we include in the model in the same way as the continuous variables.
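The variable preparation described above can be sketched outside Stata as follows. The function name is hypothetical, and the sketch assumes that the first category serves as the omitted base level, as Stata's i. prefix does by default; it also assumes strictly positive values, since the logarithm of zero (for example, zero followers) is undefined and would need separate handling.

```python
from math import log


def build_features(goal, category, n_categories=10):
    """Return [log(goal), dummy for category 2, ..., dummy for category n]."""
    # Category 1 is the base level: a project in it has all dummies equal to 0.
    dummies = [1.0 if category == k else 0.0 for k in range(2, n_categories + 1)]
    return [log(goal)] + dummies


# A project with a 100,000 ruble goal in category 3.
print(build_features(100_000, 3))
```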
Firstly, we concentrated on the cross-country research and built regression models for each of the 5 countries to understand whether the most influential factors of project success differ by country. Since the Boomstarter platform is still developing its capabilities, unsuccessful projects are still far more numerous, as can be seen in table 2. Consequently, the percentage pledged for a project is not normally distributed. Moreover, as the correlation matrix analysis showed, there is a high risk of heteroscedasticity in our model. We therefore decided to use probit regression models to understand which changes in the main factors have the greatest influence on project success. Probit models are binary response models and are run with dichotomous dependent variables: the response variable can take only two outcomes, the success or failure of a certain event, coded as 1 and 0. Binary outcome models can also be given a latent-variable interpretation. We assume that the outcome of project financing on any crowdfunding platform, that is, whether or not the project is funded, is determined by an unobservable utility index, also called a latent variable. The latent variable depends on the covariates (explanatory variables) in the model: the larger its value, the greater the probability that the project is funded at 100%.
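The latent-variable interpretation mentioned above is usually written in the following textbook form. The notation is generic and assumed here rather than taken from the thesis: $y_i^{*}$ is the latent index, $x_i$ the vector of covariates, $\beta$ the coefficient vector and $\Phi$ the standard normal cumulative distribution function.

```latex
\begin{aligned}
y_i^{*} &= x_i^{\top}\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1), \\
y_i &= \begin{cases} 1, & y_i^{*} > 0 \\ 0, & y_i^{*} \le 0 \end{cases} \\
\Pr(y_i = 1 \mid x_i) &= \Pr(\varepsilon_i > -x_i^{\top}\beta) = \Phi(x_i^{\top}\beta)
\end{aligned}
```

The last equality uses the symmetry of the normal distribution, which is why a larger value of the index always corresponds to a larger probability of success.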
To check the cross-country differences in the main factors of project success, we chose the following binary dependent variable:
, (2)
where: per_category - our dependent variable in a model.
As a result, we will get the results of 5 regression models, from which we can understand which factors are distinctive in different countries for raising the probability of project success.
For the Russian platform, we decided to run several probit regression models, with and without the author-specific features, to understand whether project-specific or author-specific characteristics influence project success more. We first set up a binary dependent variable, where:
, (3)
However, our data set has a more complicated structure, because some projects collected between 0% and 100%: we cannot call them successful, since they did not reach the targeted sum, but we also cannot call them unsuccessful, since they collected at least some money. We therefore decided to run several regression models with different binary outcomes. The first binary variable is shown above, and the percentage collected is the latent variable in all the models we set up. The latent variable is what the probit model actually estimates, but it enters the model through a binary variable (in the first case, the continuous variable "percentage collected" is divided into 2 outcomes, 0 and 1, representing failure and success). The next probit model uses the following dependent variable:
, (4)
Finally, our third regression has the following binary dependent variable:
, (5)
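The three binary outcomes listed in table 2 (Per_category, Per_category2 and Per_category3) can be sketched as one recoding function. The function name is hypothetical, and returning None for projects that fall outside a given coding is an assumption of this sketch: it reflects that each coding compares only two of the three funding groups, so the third group would be left out of that model.

```python
def binary_codings(percentage):
    """Return the three binary outcomes of table 2 for a funding percentage."""
    unfunded = percentage == 0
    partial = 0 < percentage < 100
    funded = percentage >= 100
    return {
        # 0 = 0% funded, 1 = more than 0% but less than 100% funded
        "per_category": 0 if unfunded else 1 if partial else None,
        # 0 = 0% funded, 1 = 100% and more funded
        "per_category2": 0 if unfunded else 1 if funded else None,
        # 0 = more than 0% but less than 100% funded, 1 = 100% and more funded
        "per_category3": 0 if partial else 1 if funded else None,
    }


for pct in (0, 40, 150):
    print(pct, binary_codings(pct))
```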
Now we can write out the overall probit regression equations. Our cross-country regression models can be described by the following cumulative distribution function of project success:
,
where: Pr - cumulative distribution function;
β - coefficient in regression;
β2*category1+…+β6*category5 - project categories on Boomstarter and Kickstarter.
Our regression models for the Russian data can be described by the following cumulative distribution function of project success:
, (7)
where: Pr - cumulative distribution function; β - coefficient in regression;
[] - social network variables, reflecting founder-specific factors and appearing only in half of the regression models;
β2*category1+…+β6*category5 - project categories on Boomstarter and Kickstarter;
β8*citynum1+…+β10*citynum3 - city category by population.
The regression models will show which factors influence the percentage collected for a project, and in which direction, on both the international and the Russian market. Strictly speaking, however, a probit model does not describe the percentage itself: its coefficients reflect how a one-unit change in an independent variable changes the latent index behind the probability of obtaining 1 in per_category, so their signs give the direction of the effect but not its size. The coefficients are therefore only one part of the probit regression analysis.
The next step is to compute goodness-of-fit measures (in our case the pseudo R-squared and the Hosmer-Lemeshow goodness-of-fit test) and the area under the receiver operating characteristic curve (AUC). The pseudo R-squared, also called McFadden's pseudo R-squared, ranges from 0 to 1 and shows how well our model fits the data (Cameron & Trivedi, 2009, p. 345). It compares the maximized likelihood of the full model (the one we built) with that of an empty model (intercept only, with all covariate coefficients equal to 0); the larger the improvement over the empty model, the closer the measure is to 1. The Hosmer-Lemeshow goodness-of-fit test sums the squared differences between the observed and expected numbers of cases per covariate pattern, each scaled by its standard error. Its null hypothesis states that there is no significant difference between the observed and the predicted values. The test returns a chi-square statistic and a p-value; if the p-value is less than the significance level, we reject the null hypothesis, which indicates a poor fit (Cameron & Trivedi, 2009, p. 412). Finally, AUC is the area between the ROC curve and the false-positive-rate axis. The ROC curve plots the share of positive values of the dependent variable (observations with value 1) that the model classifies correctly against the share of negative values that it classifies as positive, across classification thresholds. A perfect model, which separates successful and unsuccessful projects ideally, has an AUC of 1. After these tests, we can be sure that the models were specified correctly and choose the model that best describes project success on the Russian crowdfunding platform.
After this, we move on to the analysis of marginal effects (Hernández-Orallo, 2013).
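Two of the fit measures described above can be computed by hand from the outcomes and the predicted probabilities. The sketch below is illustrative: the function names and the toy data are assumptions, the empty model is taken to predict the sample share of ones for every project, and AUC is computed through its equivalent interpretation as the probability that a randomly chosen successful project receives a higher predicted probability than a randomly chosen unsuccessful one.

```python
from math import log


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y under predicted probabilities p."""
    return sum(log(pi) if yi == 1 else log(1 - pi) for yi, pi in zip(y, p))


def mcfadden_r2(y, p):
    """McFadden's pseudo R-squared: 1 - lnL(full model) / lnL(intercept-only model)."""
    base_rate = sum(y) / len(y)  # prediction of the intercept-only model
    ll_null = log_likelihood(y, [base_rate] * len(y))
    return 1 - log_likelihood(y, p) / ll_null


def auc(y, p):
    """Area under the ROC curve via pairwise comparison; ties count as one half."""
    positives = [pi for yi, pi in zip(y, p) if yi == 1]
    negatives = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in positives for b in negatives)
    return wins / (len(positives) * len(negatives))


# Toy outcomes (1 = funded) and predicted probabilities from some fitted model.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(mcfadden_r2(outcomes, predicted), auc(outcomes, predicted))
```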
As the final part of the analysis, we compute marginal effects on the probability of funding the project to a certain degree, calculated as average marginal effects (AME). The probit regressions give us the coefficients β, which indicate how much the latent variable changes when a covariate changes by 1 unit. However, this interpretation is not enough to quantify how the probability of success changes when the factors influencing it change, because the response of the probability to a change in a covariate differs across observations. The next step is therefore to compute marginal effects. A marginal effect reflects how much the probability of obtaining 1 in y (our dependent variable), which in our case means collecting 100% of the targeted sum, changes when a covariate changes by 1 unit. Since each observation has its own probability response to a 1-unit change in a covariate, we need to decide how to compute marginal effects for the whole data set.
There are two options for computing marginal effects. The first is the average marginal effect. The essence of this method is as follows: we first estimate the marginal effect for every observation in the model (how much the probability of success changes when the covariate increases by 1 unit for that particular observation), and then average the obtained values to get the average marginal effect of the covariate. The second method is called the marginal effect at the average observation. It is estimated as the marginal effect for an observation with average covariate values, which means that we first compute the mean of each covariate and then compute the marginal effect at these mean values. The vast majority of the econometric literature states that there is no substantial difference between the two methods in practice; therefore, we decided to use the first one and estimate average marginal effects (Gujarati & Porter, 2006, p. 567).
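The two options can be compared on a small example. The sketch below assumes a probit model with one continuous covariate; the coefficients `beta` and the data `X` are hypothetical and are chosen only to make the two quantities easy to follow, and the function names are introduced for this illustration.

```python
import math

def norm_pdf(z):
    """Standard normal density, the derivative of the probit link."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def linear_index(x, beta):
    """x'beta, where beta[0] is the intercept."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def average_marginal_effect(X, beta, j):
    """Mean over observations of phi(x_i'beta) * beta_j (j is 0-based in x)."""
    mean_density = sum(norm_pdf(linear_index(x, beta)) for x in X) / len(X)
    return mean_density * beta[j + 1]

def marginal_effect_at_mean(X, beta, j):
    """phi(xbar'beta) * beta_j, evaluated at the covariate means."""
    k = len(X[0])
    xbar = [sum(x[c] for x in X) / len(X) for c in range(k)]
    return norm_pdf(linear_index(xbar, beta)) * beta[j + 1]

# Hypothetical coefficients (intercept, slope) and one covariate per observation.
beta = [0.5, -1.0]
X = [[-2.0], [0.0], [2.0]]

ame = average_marginal_effect(X, beta, 0)
mem = marginal_effect_at_mean(X, beta, 0)
print(round(ame, 4), round(mem, 4))
```

With such a small and widely spread sample the two numbers differ visibly; on large samples they are often close, but they are not identical in general, so it is worth stating which of the two is reported.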
In the cross-country regression analysis, we are going to see which factors have the greatest influence on the probability of project success on a crowdfunding platform in each country. After reviewing the literature, we assume that the targeted sum of a project should be equally significant for the success probability in all countries. However, we expect that the significance of certain categories may differ from country to country. Moreover, we are going to check whether there is a nonlinear relation between the targeted sum and the success probability in any country and whether it is significant.
As for the features specific to the Russian crowdfunding platform, we would like to pay attention to the significance of the author's characteristics for the probability of success. Since we retrieved the authors' data from the VK social network, we are, to our knowledge, pioneers in studying data from this source in the context of crowdfunded projects. Therefore, we would like to understand whether this country-specific social network influences project success on a crowdfunding platform in Russia differently than Facebook data did in previous research on the Kickstarter platform, for instance, by Koch (2016) and Johnson et al. (2018).
After estimating the marginal effects, we decided to plot them. This will help us to understand the main tendencies in how the covariates change the success probability. In the next section of our research, we describe the results and main implications of this study.
4. Results description
As the primary estimation of the factors influencing project success, we chose descriptive statistics with a mean-comparison test. We first look at the cross-country descriptive statistics table, which contains data from both Kickstarter and Boomstarter.
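A mean-comparison test of this kind can be sketched from summary statistics alone. The text does not state which variant of the test was used, so the version below is an assumption: a Welch-type t statistic (unequal variances) with a normal approximation for the p-value, which is reasonable only for large groups. The input numbers are hypothetical and are not taken from the tables.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for the difference of two group means, unequal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation (large samples)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# Hypothetical groups: mean, standard deviation, number of observations.
t_stat = welch_t(50.0, 10.0, 100, 47.0, 10.0, 100)
p_value = two_sided_p_normal(t_stat)
print(round(t_stat, 3), round(p_value, 4))
```

In this hypothetical case the difference would be marked as significant at the 5% level but not at the 1% level.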
To see the differences between successful projects and those which collected almost nothing, we created a variable with three categories: projects which were absolutely unsuccessful (0% collected), which form the first category in tables 6 and 7; projects which collected a certain amount of money (more than 0% but less than 100% collected), which form the second category; and successful projects (100% or more collected), which form the third category.
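The three-way split described above amounts to a simple rule on the percentage collected. A minimal sketch follows; the function name and the sample percentages are hypothetical.

```python
def success_category(percent_funded):
    """Return 1, 2 or 3, matching columns (1)-(3) of the descriptive tables."""
    if percent_funded <= 0:
        return 1   # nothing collected
    if percent_funded < 100:
        return 2   # something collected, goal not reached
    return 3       # goal reached or exceeded

# Hypothetical percentages collected by five projects.
sample = [0, 12.5, 99.9, 100, 340]
categories = [success_category(p) for p in sample]
print(categories)
```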
The consolidated descriptive statistics for the two platforms may be seen in table 6. The differences in means are given for unsuccessful projects, projects with more than 0% but less than 100% funded, and successful projects. By reviewing this table, we want to find out whether the differences in targeted sums and numbers of investors between successful and unsuccessful projects vary across countries.
Table 6 Descriptive statistics (mean, standard deviation, difference between means for different project success levels) for Boomstarter and Kickstarter projects.
Variable | (1) | (2) | (3) |
 | 0% funded | More than 0% but less than 100% funded | 100% and more funded |
Russia |
Targeted sum for a project, thousand USD | 3749.724¡¡¡ | 193.896 | 156.429 |
 | (1.68e+05) | (282.564) | (232.913) |
Number of investors, people | 2.161¡¡¡ | 17.400*** | 135.722××× |
 | (3.305) | (25.838) | (281.428) |
Observations | 3,191 | 2,442 | 1,104 |
USA |
Targeted sum for a project, thousand USD | 93.239¡¡¡ | 60.951*** | 9.845××× |
 | (1721.144) | (1261.868) | (25.356) |
Number of investors, people | 0¡¡¡ | 19.663*** | 220.942××× |
 | (0) | (60.179) | (1225.917) |
Observations | 29,617 | 106,726 | 82,752 |
Great Britain |
Targeted sum for a project, thousand USD | 162.059¡¡¡ | 45.711** | 8.103××× |
 | (3916.408) | (456.685) | (21.173) |
Number of investors, people | 0¡¡¡ | 18.465*** | 155.131××× |
 | (0) | (67.535) | (494.641) |
Observations | 3,734 | 11,929 | 8,879 |
Canada |
Targeted sum for a project, thousand USD | 101.640¡ | 63.339 | 7.717×× |
 | (1932.657) | (1191.393) | (12.752) |
Number of investors, people | 0¡¡¡ | 18.025*** | 228.952××× |
 | (0) | (47.315) | (1457.354) |
Observations | 2,017 | 5,464 | 2,894 |
Australia |
Targeted sum for a project, thousand USD | 98.131 | 84.114 | 8.515××× |
 | (713.216) | (1604.975) | (15.325) |
Number of investors, people | 0¡¡¡ | 17.346*** | 262.089××× |
 | (0) | (54.103) | (1069.770) |
Observations | 1,069 | 3,203 | 1,384 |
Note: standard deviations in parentheses. *** p<0.001 mean (1) to (2) difference; ×× p<0.01 mean (2) to (3) difference; ¡ p<0.05, ¡¡ p<0.01, ¡¡¡ p<0.001 mean (1) to (3) difference |
The first and most important implication of this table is that in the Kickstarter data a project with 0% funded always has 0 backers, which is not the case in the Boomstarter data. This may result from differences in the regulations of the two platforms. The most significant consequence, however, is that we cannot estimate the influence of the number of investors on project success with a simple probit regression on a dichotomous dependent variable indicating whether the project succeeded or not: the number of backers takes a single value (0) for all projects with 0% funded, which affects the whole regression model significantly. Therefore, we decided to run one more probit regression by country with another binary variable, equal to 0 for projects that collected more than 0% but less than 100%, and equal to 1 if the project collected 100% or more. Below we describe the results of the two probit regression models by country.
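The two dependent variables can be sketched as follows. The function names and the sample percentages are hypothetical; the point of the second variable is that projects with 0% collected are dropped from the sample rather than coded as 0.

```python
def success_vs_rest(percent_funded):
    """First probit: 1 if the goal was reached, 0 otherwise."""
    return 1 if percent_funded >= 100 else 0

def success_vs_partial(percent_funded):
    """Second probit: 1 if the goal was reached, 0 if partially funded.

    Projects with 0% collected return None and are excluded from the sample.
    """
    if percent_funded <= 0:
        return None
    return 1 if percent_funded >= 100 else 0

# Hypothetical percentages collected by five projects.
sample = [0, 12.5, 99.9, 100, 340]
y_first = [success_vs_rest(p) for p in sample]
y_second = [v for v in (success_vs_partial(p) for p in sample) if v is not None]
print(y_first, y_second)
```

The second sample is shorter than the first because the project with nothing collected no longer enters the regression.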
The next important implication of the descriptive statistics by country is that the differences between mean targeted sums vary across countries. The only feature common to all countries is that the more successful the project, the lower its targeted sum, because the mean targeted sum decreases as the percentage collected increases. Therefore, the minimal mean targeted sum is found in the most successful category, that is, among projects which collected 100% or more. As we may see from table 6, Russia is the country with the highest contrast in targeted sums. The difference between mean targeted sums is insignificant when comparing successful projects with those which collected something, but not the whole sum. However, if we look at the projects with 0% collected in Russia, we see that the difference between them and successful ones is significant at the 0.1% level (p < 0.001), which means that such a difference would be very unlikely if the true mean targeted sums of successful and unsuccessful projects were equal. This is because the mean targeted sum of unsuccessful projects is 24 times higher than that of successful projects, whereas the projects which collected something but less than 100% differ insignificantly in mean targeted sum from successful projects. This shows that, in order to collect 100% of the targeted sum on the Russian crowdfunding platform, one should set a funding goal of nearly 150,000 RUB. Although this sum may not be sufficient for substantial startups from different industries, we have to point out that the Boomstarter crowdfunding platform has existed for only 7 years and is not yet as popular as Kickstarter.
As for the other countries, the situation is much the same and the tendency is similar. In the USA, Canada and Australia, the difference between the mean targeted sum of unsuccessful projects and of those which collected at least something, but not 100% of the sum, is lower than the difference between the latter and successful projects. In all countries except Australia, the differences between targeted sums by success category are mostly significant. However, because of the high standard deviation of the targeted sum for projects which collected more than 0% but less than 100%, we cannot fully rely on this result. Moreover, the difference between the mean targeted sum of unsuccessful projects and of those which collected at least something, but not 100% of the sum, in Australia is similar to that in the other countries. We consider the main reason for this consistency to be the fact that the projects come from different countries but from the same platform. Therefore, the tendencies remained constant regardless of the project's country of origin.
We now move on to the differences in the number of investors among countries and success categories. First of all, if we look at the numbers for Russia, we see that the difference is always significant at the 0.1% level (p < 0.001). The difference between the successful category and the two others is the greatest: a mean of 135.7 investors against 17.4 and 2.2. This tendency is logical and expected, so we may move on to the other countries. As we may see, the increase in the number of investors across categories observed for Russia holds for all other countries as well. Moreover, as the second column of table 6 shows, the mean number of investors for projects which obtained more than 0% but less than 100% of the targeted sum is similar in all countries, which suggests that crowdfunding itself works similarly in all countries, by the principle “the more investors, the more money collected”. This concludes our cross-country descriptive statistics analysis.
As we may see, there are two main tendencies, both of which we expected. The first is that the relation between the targeted sum and the percentage funded is negative. This was already discussed when analyzing the correlation matrix and is now supported by the descriptive statistics. The second is that the more investors a project attracts, the higher the percentage of the targeted sum it collects. Next, we want to see which other factors may influence project success, based on the data from the Russian crowdfunding platform. Therefore, we move on to the Boomstarter descriptive statistics analysis.
Table 7 Descriptive statistics (mean, standard deviation, difference between means for different project success levels) for only Boomstarter projects with author-specific variables
Variable | (1) | (2) | (3) |
 | 0% funded | More than 0% but less than 100% funded | 100% and more funded |
Targeted sum for a project, thousand RUB | 494.84¡¡¡ | 195.05*** | 162.95 |
 | (1128.69) | (292.12) | (271.46) |
Author's biography length in symbols | 480.20 | 553.46 | 475.77 |
 | (696.99) | (973.52) | (509.21) |
Author's followers number | 504.74¡ | 1269.32*** | 4004.92 |
 | (1932.07) | (5782.79) | (42220.18) |
Author's age | 35.00 | 35.23 | 36.59 |
 | (13.76) | (12.34) | (12.42) |
Gender (1 - female, 0 - male) | 0.85¡¡¡ | 0.74*** | 0.73 |
 | (0.36) | (0.44) | (0.44) |
Sites (1 - project has a website, 0 - no) | 0.70¡¡¡ | 0.83*** | 0.89×× |
 | (0.46) | (0.38) | (0.31) |
Facebook (1 - author has FB, 0 - no) | 0.47¡¡¡ | 0.62*** | 0.66 |
 | (0.50) | (0.49) | (0.48) |
Observations | 795 | 699 | 377 |
Note: standard deviations in parentheses. *** p<0.001 mean (1) to (2) difference; ×× p<0.01 mean (2) to (3) difference; ¡ p<0.05, ¡¡ p<0.01, ¡¡¡ p<0.001 mean (1) to (3) difference |
The Boomstarter descriptive statistics may be found in table 7. First of all, as we may see in table 7, the difference between the mean targeted sums is statistically significant (p < 0.001) only when we compare the projects which collected nothing with the other projects (those which were successful and those which collected a certain amount of money but did not reach the targeted sum). If we look at the mean difference between the projects which collected 100% or more and those which collected more than 0% but less than 100%, we see that it is small and not statistically significant. The targeted sums of unsuccessful and successful projects, in contrast, differ by more than a factor of two: the mean targeted sum of unsuccessful projects is about 2.5 times higher than that of projects which collected at least something, and 3 times higher than that of successful projects. As a result of reviewing the differences in targeted sums, we may say that if the author wants to collect at least something on Boomstarter, he or she would do better to set a targeted sum of about 200,000 RUB.
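The ratios quoted above follow directly from the mean targeted sums in table 7 (thousand RUB), as this short check shows:

```latex
\frac{494.84}{195.05} \approx 2.54, \qquad \frac{494.84}{162.95} \approx 3.04
```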
Next, we move on to the length of the author's biography. The biography on Boomstarter is a text field where the author may tell investors about his or her background, contribution to this or other projects, or even about the project or its idea itself; the author may also leave it blank, since this field is optional and free-form. We decided to count the number of symbols in the biographies and make it one of the control variables in our regression models, which will be discussed later. We did this in order to understand whether investors take the author's background into account when deciding to invest and whether the length of this free-form story influences the success of the project. The summary statistics show that there is almost no difference between the mean biography lengths of the different success categories. We may only guess that the projects which collected a certain amount of money, but not enough, had slightly longer biographies than the unsuccessful and successful ones, and that this may have been one of the factors which held potential investors back. However, since these three numbers differ only slightly from each other, we may say that there are probably other factors which influence success more than the length of the author's biography.
While retrieving data from the Vkontakte social network, we assumed that users (project authors in our case) are more likely to be active communicators when they have a greater number of followers; this was shown for Facebook by Mollick (2014) and Johnson et al. (2018). Therefore, we decided to include this variable in our analysis. As table 7 shows, the more followers the author has in the social network, the higher the chances of success of his or her project on Boomstarter. While collecting the data, we also saw that some users post details of their project on their social network pages, which means that they may promote the project there and find more investors through the social network. The difference in the mean number of followers between the projects which collected nothing and those which collected at least something is highly significant (p < 0.001). The category of successful projects is an exception: its high standard deviation weakens the significance of its difference from the unsuccessful category, which holds only at the 5% level (p < 0.05). So we may say that the author's activity in the social network matters and that successful projects are probably those whose authors have a large number of followers on Vkontakte.
Next, we turn to the author's age variable. It was included in the study by Ryua & Young-Gul (2018), which explored the motivation behind creating crowdfunding projects but not their success rate, so we decided to add it to check its impact. It is the second variable characterizing the project author. As we may see in the table, the age of the project author does not appear to influence success. All success categories have almost the same mean age, and even the standard deviations are similar. Therefore, after reviewing the statistics, we may conclude that the age of the author is not the main factor in determining whether the project will be successful.
The last author characteristic in the list of variables, though not the last variable overall, is gender. Gender is a binary variable, where 1 is female and 0 is male; its influence on the success of entrepreneurial projects has been examined in multiple studies, such as Geiger & Oranburg (2018) and Anglin et al. (2018). As we may see, the largest proportion of female entrepreneurs is found in the unsuccessful category of projects, which means that our summary statistics confirm the results of previous research.
Next comes the sites variable, which shows whether the project has its own website specified on the project's page. The variable is also binary and equals 1 if the project has a website and 0 if it does not. As table 7 indicates, the successful category has the greatest proportion of projects with websites. However, from the summary statistics alone, we cannot state whether the existence of a website raises the probability of project success. We may only say that nearly 9 out of 10 projects in the successful category have a website, against only 7 out of 10 in the unsuccessful one. The same situation happens while s...