Factors of successful protection from pressure on business
Concept and economic essence of property rights. Justification and development of a business protection model against possible damage to business activities caused by the influence of various external and internal market factors and economic conditions.
Category | Economic and mathematical modeling |
Type | diploma thesis |
Language | English |
Date added | 11.08.2020 |
File size | 5.0 M |
Considering all the previously described cases, it is possible to state that the behavior of business representatives depends on the general balance of power between state and business. Studying the Korean and Philippine experience of crony capitalism development, Kang [Kang, 2002] obtained similar results. The researcher found that business-state relationships can differ under crony capitalism, but the main features of this system remain the same: political instability, weak rule of law and corruption [Ivanovic, 2017]. Thus, Kang provided an analytical schema describing how state-business relations can vary over time under crony capitalism.
Four main states of such relations are distinguished, depending on the coherence of the state and the concentration of business: laissez-faire (fractured state and dispersed business), predatory state (coherent state and dispersed business), rent seeking (fractured state and concentrated business) and mutual hostages (coherent state and concentrated business) [Kang, 2002, p.15]. The idea behind this classification is that, when the competitiveness of an economy has to be ensured under imperfect institutions, each state-business relation at a given point in time can be placed in one of these positions [Kang, 2002].
Depending on the type of state-business relations, and applying the “strong-state” (type III) and “weak-state” (type II) corruption schemas, we can describe different combinations of business-state relations. Type IV (laissez-faire) is the safest situation: neither government nor business has enough power to seek economic or political dominance [Kang, 2002]. There are few or no bribes, the environment is highly competitive, and neither state nor business has a chance to get the upper hand over the other [Kang, 2002]. The rent-seeking stage (type II) combines a weak state with strong business groups that have enough resources to impose their will on a government that, in fact, does not have real power. On the contrary, type III (predatory state) is characterized by a strong state that prevails over business; state agents usually seek rent in this case. Finally, in type I (mutual hostages) both business and state are powerful enough to harm each other “but are deterred from such actions by the damage that the other side can inflict” [Kang, 2002].
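The four-cell typology described above can be sketched as a simple lookup. This is only an illustration in Python: the function name `kang_type` and the boolean encoding of state coherence and business concentration are assumptions made for the example and are not part of Kang's schema.

```python
# Kang's typology as summarized in the text: the type depends on whether the
# state is coherent or fractured and whether business is concentrated or dispersed.
# Keys are (state_coherent, business_concentrated); this encoding is an assumption.
KANG_TYPES = {
    (True, True): ("I", "mutual hostages"),
    (False, True): ("II", "rent seeking"),
    (True, False): ("III", "predatory state"),
    (False, False): ("IV", "laissez-faire"),
}


def kang_type(state_coherent: bool, business_concentrated: bool) -> tuple:
    """Return (type number, label) for a given state-business configuration."""
    return KANG_TYPES[(state_coherent, business_concentrated)]


# A coherent state facing dispersed business falls into the predatory-state cell.
print(kang_type(state_coherent=True, business_concentrated=False))
```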
The schema in figure 1 is a combination of theoretical findings about property rights and practical findings from previous studies.
Figure 1. Types of state-business relations (created by the author)
Using this schema, we can understand which type of state-business relationship best describes the current situation in Russia.
The main issue in applying the concepts described above is that we cannot be sure how to classify modern state-business relations in Russia. Russia arguably has a unique experience of state-business relations, since it has passed through three types of these relations in the last 30 years: from the political and economic instability of the 1990s to a weak state and strong business in the early 2000s and a strong state in the late 2000s and early 2010s. Several significant viewpoints have been put forward in recent years.
According to the definition of crony capitalism presented above, business-state relations in Russia can be characterized as a case of crony capitalism. First, as stated in [Ivanovic, 2017], there are two types of business-political integration that are not mutually exclusive: backward integration (business over politics) and forward integration (state over business) [Ivanovic, 2017, p.50-51]. The same ideas were expressed in [Yakovlev, 2005] under the terms “state-capture” and “business-capture”, as both types of integration have taken place in Russia during the past 30 years. Political instability resulted from the fall of the Soviet Union. Unlike the Croatian experience, the period between 1990 and 1998 is considered to be a period of a “weak state” with politically and economically dominant business groups [Yakovlev, 2005]. Due to the lack of property rights (PR) protection mechanisms, business gained the upper hand over political organizations, such as the Parliament, which was extremely weak in terms of its ability to provide credible commitment and maintain the rule of law.
After 2000, domestic politics can be described by the strengthening of the state apparatus, including the rise of taxes and the provision of a credible PR security mechanism through the creation of an appropriate judicial system [Yakovlev, 2005, p.13]. Other evidence of this transition may be found in the pressure the government exerted on business elites (or oligarchs) in the 2000s. For instance, the famous “Yukos case” may be considered an attempt to deal with the business elite, which had enormous power after the 1990s. In other words, the transfer from “state-capture” to “business-capture”, or forward integration (state over business), took place [Yakovlev, 2005]: rapid changes in the political-economic balance were not accompanied by relevant institutional changes, and the PR of private companies were poorly secured. Thus, state agents got an opportunity to use their institutional positions not to create rent but to follow their personal will and to extract rent [McChesney, 1987]. For now, it seems that Russian state-business relations have moved from the “rent-seeking” to the “predatory-state” stage. Although a number of governmental and non-governmental institutions for PR protection have been created since the early 2000s, the security of business PR from predatory elements of government is still a matter of great importance [Yakovlev et al., 2014].
Second, it is hard to believe that under “rent-seeking” or “predatory-state” relations there were enough legally established institutions providing state-business communication. It is more likely that in the early and mid-1990s there was little law, and the “state-capture” process was carried out through a high level of corruption or other actions that led to the coalescence of state and business.
Finally, there was no smooth transition to a market economy: in his paper “Russia's Economy under Putin: From Crony Capitalism to State Capitalism”, Simeon Djankov argues that, due to the strong communist elite, the abundance of natural resources and unfair privatization processes, which led to the creation of large business groups, the transition to crony capitalism took place in the early 1990s [Djankov, 2015, p.2-4]. “The loans-for-shares scheme, started in 1995 prior to the presidential election early in the following year, provided for the transfer of ownership in several state-owned natural resource enterprises to major businessmen in exchange for loans to the government. It led to the creation of large financial-industrial groups with influence on the government” [Djankov, 2015].
On the other hand, despite strong governmental control over the whole political and economic situation, the late 2000s can be characterized by a greater number of legal PR-security institutions, such as business associations [Yakovlev, 2005]. Moreover, in terms of Salter [Salter, 2014], the toolkit of “corruption” (informal mechanisms) has also changed. For instance, Szakonyi [Szakonyi, 2018] showed that parliament membership of one of a company's key stakeholders has a positive effect on the profitability of the enterprise [Szakonyi, 2018, p.336]. Sakaeva [Sakaeva, 2012] describes a parliament as an “opportunity window” [Sakaeva, 2012, p.106] that entrepreneurs use to gain more political connections through direct and indirect strategies.
Other researchers insist that the current state of affairs differs from the classical definition of crony capitalism: Djankov [Djankov, 2015] describes a transition from crony capitalism to state capitalism “where the state either owned the main productive assets outright or they were held by personal friends of the president…”. However, under this definition the difference between state capitalism and crony capitalism is not quite clear, because both have the same features.
Another approach to studying state-business relations can be found in more descriptive sociological studies. Leonid Kosals [Kosals, 2006], in his paper “Interim outcome of the Russian transition: clan capitalism”, argues that “crony capitalism” is one of the forms of clan capitalism, in which the dominant clan takes economic and political power over the whole system. Under clan capitalism, “the main source of the development within this system is the competition among clans” [Kosals, 2006, p.24]. The author states that such a “clan war” became possible due to institutional weaknesses in the early and mid-1990s, during the confrontation between a number of elites, such as the oligarchs' clans, the old Soviet clan and the new political clan.
Even though the concept of crony capitalism as a part of the clan system was poorly described by L. Kosals, the paper gives two significant insights into the methods businesspeople use to deal with institutional imperfections under clan capitalism. First, the structure of a clan is described. The author argues that each clan consists of the same structural positions (in descending order of priority) [Kosals, 2006, p.21]. In addition to the initial description, I divide these positions into inner and outer circles.
The inner circle consists of:
1) The chieftain is the key clan member. He is responsible for team recruitment and political relationships with people outside the clan. The chieftain also gains most of the benefits and uses a charismatic type of leadership [Kosals, 2006, p.13].
2) Core members are key members of the chieftain's team. They support the leader in his everyday decision-making and act as representatives of the leader in different spheres of clan activity (trade, finance, etc.). Core members are usually the closest friends of the leader [Kosals, 2006, p.15].
The outer circle consists of:
3) Well-paid specialists hired by core members. These people have little knowledge about clan activities and are not connected with the clan leader. Their role is more functional.
4) Ordinary specialists and workers do not have any information about clan activities. Their primary goal is to serve the business entities created by clan members.
5) Last but not least are agents of influence. “They render important services for clan providing insider information, warning about any jeopardy, trying to direct the policy to the sake of the clan. The biggest clans have their representatives at the top of the power” [Kosals, 2006, p.18].
Moreover, clans have their own hierarchy among themselves: there is a top clan and several subordinate clans at each level of the political or economic ladder [Kosals, 2006, p.23].
The second important idea of this paper is the assumption that a social network of personal relations glues clans together: “therefore hidden social network was operating not as links between individuals but as the system of hidden relations within certain social group glued by the personal trust between members” [Kosals, 2006, p.7]. This brings us to an idea that is discussed more often in sociological papers than in economic studies: personal and professional relations do matter for private PR security mechanisms in an imperfect institutional environment. For instance, links among clan members and between clan members and third parties play a crucial role in adapting to imperfect institutions. Key stakeholders, the members of the inner circle, have a wide range of possibilities to reorganize internal processes, such as hiring policy, or to influence agents of influence in order to gain benefits.
Confirmation of the importance of interpersonal relations between state and business agents can be found in both network [Razo, 2015] and non-network studies. For instance, Sakaeva [Sakaeva, 2012, p.112] notes that St. Petersburg's legislative assembly is more like a relational marketplace where entrepreneurs can exchange their reputation for PR security. As mentioned earlier, Szakonyi [Szakonyi, 2018] found a relation between Parliament participation and the increase of enterprise income. So political connections can help to acquire both direct and indirect benefits.
So, Russian state-business relations can be characterized as a unique and difficult-to-classify case. While having an objectively strong government apparatus, the system cannot be characterized as crony capitalism or dictatorship in its pure form. In order to make decisions, the Russian government must think about the legitimacy of its actions and how the business community will respond to them. On the other hand, business in Russia is still somewhat restricted by the government, and the key stakeholders of the biggest enterprises are heavily affiliated with the government apparatus. Thus, according to Kang's typology, state-business relations can be mapped somewhere between mutual hostages and strong-state (predatory state) relations.
On the other hand, in order to observe the effects of these interpersonal relations, researchers commonly use statistical models such as regression analysis [see Szakonyi, 2018; Acemoglu et al., 2016; Amore, Bennedsen, 2013]. Thus, it would be useful to apply one of these methods to get deeper insight into the data. The choice of method, however, would be driven by the data.
1.4 Literature review conclusion
In this section a literature review was conducted. The term “damage” was formalized through the definition of property rights and the incentives of entrepreneurial activity. It appeared that the cases suitable for studying the current topic should involve the alienation of property rights by government actors and strategies in which entrepreneurs use both legislative and informal structures to secure property rights. The overview of cases from previous studies helped to define the range of business-government relations that can characterize contemporary Russia. The importance of interpersonal connections was highlighted and justified.
Section 2. Methodology
2.1 Research design
As mentioned earlier, the main goal of the research is to identify combinations of factors that may give an entrepreneur the opportunity to minimize the damage received in disputes with state agents on the opposite side. Since past studies have focused mainly on one type of strategy per article (mostly political strategies [see Szakonyi, 2018; Markus, 2012]), this paper follows the logic of an exploratory research design with a data-driven approach: factors are studied regardless of their nature. For instance, both easily changeable factors (such as purposefully established personal connections and membership in depersonalized organizations) and relatively hard-to-change ones (the size of the enterprise, the industry) are within the scope of the research.
On the one hand, such a formulation simplifies the task of establishing a research design, since it removes restrictions on the set of available methods. On the other hand, a number of questions arise regarding the collection and preparation of the final dataset.
First, it is unclear what kind of data should be collected. Previous studies focused primarily on one type of factor, but when all possible combinations are studied, data collection becomes a non-trivial task: it is impossible to trace every single action, such as establishing arrangements (for instance, between the entrepreneur and lawyers) or joining an organization, even for one entrepreneur during the case being studied, whereas quantitative analysis requires action logs for hundreds of entrepreneurs.
Thus, the “black box” framework was applied. The central idea was to admit that it is impossible to get data about every action of an entrepreneur during the case. However, information about the characteristics of the enterprise and the entrepreneur can be clearly identified in public resources. This data precedes the case. The outcomes of the case can also be clearly identified in the form of judicial opinions and publications in mass media. Thus, it is possible to take all the preliminary characteristics and the final outcomes and see which combinations of characteristics “winners” are more likely to have.
As for publicly available data sources, the idea of using Spark-Interfax and the Central Election Commission of the Russian Federation was taken from a previous study [Szakonyi, 2018]. As mentioned in the literature review section, in the paper “Businesspeople in Elected Office: Identifying Private Benefits from Firm-Level Returns” David Szakonyi uses the Spark-Interfax database and the Central Election Commission database to study the relationship between winning elected office and firm performance in terms of revenue and profitability [Szakonyi, 2018]. That study addressed a topic closely related to the topic of the present paper, and the sources proved to contain good-quality data with a share of missing values low enough to allow a quantitative analysis of the key economic features of an organization. The data sources were therefore considered reliable enough for exploratory research on the present topic. Moreover, the Spark-Interfax database also provides information about recent mass-media publications and the history of court cases, which can be useful for determining case outcomes.
Despite the large amount of data offered by the Spark-Interfax and Central Election Commission databases, more data is needed to achieve the goal of the present study. Since the aim is to include all factors that influence the chances to “win”, the information about personal affiliations and case outcomes had to be extended. It was decided to extend the database with information taken from mass media publications. Undoubtedly, this data source has several limitations, such as incomplete or distorted data and the difficulty of data collection. However, mass media may serve as an effective complement to the Spark database, for instance, when the case outcome cannot be found among the court resolutions provided by Spark, or when information is needed about public offices previously held by the entrepreneur.
The second consequence of the exploratory research design was the absence of clear recommendations on the sampling procedure. Since the target group of the research consists of enterprises that took part in government-business disputes, the information needed to separate such enterprises from a random sample would have to be available in public resources. This is not the case for the group we are interested in. First, since many government-business disputes are connected with the “strong state” model and the abuse of authority, it is natural that the interested actors want the publicity of these cases to be as low as possible. Second, not all entrepreneurs use public resources such as courts or mass media to solve their problems. It is possible to assume that an unknown number of cases are resolved at the level of personal agreements between government and business actors. Thus, random sampling of enterprises would give the researcher a database in which the proportion of target enterprises is likely to be close to zero because of the lack of information. Instead of a random sample, a sample was taken that potentially maximizes the probability that an enterprise or entrepreneur was part of a real-life government-business dispute.
The original data was taken from the dataset of applications received from 2011 to 2016 by the Center of Public Procedures (CPP) “Business against corruption”. The database has already been used in a number of studies [see Marques et al., 2020; Levina et al., 2016; Yakovlev et al., 2014]. CPP “Business against corruption” (BAC) is a public organization founded in 2011 on the basis of the “Delovaya Rossiya” association. The main goal of BAC is to consider entrepreneurs' appeals about raider capture and corruption.
The dataset contained basic case information, including the victim's name and position, the name of the victim's company, the characteristics of the case, the existence of a court case and “Business against corruption” procedural information, such as the review phase and the supportive measures provided.
The main strengths of this database as a data source are:
1) The accessibility of “Business against corruption” help. In fact, there are no restrictions on entrepreneurs' appeals, implying that no individual characteristics, such as personal connections, the size of the enterprise or political career, affect the chances of an application being considered. This provides an opportunity to treat the resulting sample as relatively uniformly distributed in terms of applicants' personal characteristics. In other words, the sample should provide data with variance in the characteristics of enterprises and entrepreneurs.
2) Instead of picking a random sample with an extremely small fraction of target-group enterprises, a sample with the maximum number of potentially suitable cases is collected. In this case, the probability of obtaining valid data is maximized and the relationships between variables are less distorted by invalid cases.
3) An appeal to the CPP “Business against corruption” does not guarantee that the outcome of the case will be positive for the entrepreneur. Moreover, the application itself does not guarantee that any help will be provided to the applicant. Pressure on business may occur on an entirely legal basis. This fact removes the “bad state” restriction, which would imply that all the cases exist because government actors unfairly exploit their institutional positions. Instead, a diversified sample is obtained, regardless of which party to the dispute should be treated as a possible “victim”.
The main disadvantage of this data source is that there is no guarantee that the final sample will be representative. As stated earlier, a random sample may distort existing relationships, since the vast majority of observations would not meet the general sampling requirement (namely, participation in a government-business dispute). Therefore, a non-random sample based on CPP “Business against corruption” data appears to be the most adequate solution for this research.
2.2 Data collection
Based on the conclusions made in the research design section, the dataset was compiled.
The final dataset is based on the database of the “Business against corruption” center, which initially included information about applications received by the BAC center from 2011 to 2016: information about the applicant and the company related to the case, a case description and the stages of the BAC review procedure that the application had passed.
The “entrepreneur - enterprise” combination was chosen as the unit of analysis. The logic of this decision is rooted in the theoretical background of the study. Since both sides of this combination provide unique characteristics for the case, it is within the scope of exploratory research to include all possible combinations of observable characteristics. For instance, by recognizing the community of entrepreneurs as a potential source of “bottom-up” power, we can assume that personal characteristics such as membership in a political party can be treated as a source of power that businessmen use to protect their interests. The opposite can also be true: the chances of an entrepreneur to avoid criminal prosecution or to save a company might be increased by the size of the company (the “too big to fail” case) or the industrial sector in which the enterprise operates.
In the initial BAC database each row stands for a distinct application, which implies different “entrepreneur - enterprise” combinations. These combinations, however, were not unique, since there are no restrictions on the number of “Business against corruption” applications per entrepreneur. If the same “entrepreneur - enterprise” combination was found in more than one application, only the historically first application was used in the analysis. This is logically justified because, as stated earlier, we are interested in the preliminary facts and the outcomes of available cases. The first appeal to an institution such as the Business against corruption center should be treated as a sign that the other protection mechanisms available to the entrepreneur were either not used or not “strong” enough. In both situations the outcome of the case is unknown at the moment the application is created, and later we can see which combinations lead to a “positive outcome” for the entrepreneur and the company.
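The rule of keeping only the historically first application per “entrepreneur - enterprise” combination can be sketched as follows. The thesis reports that its computations were done in R; the Python fragment below is only an illustration, and the record fields (`entrepreneur`, `enterprise`, `received`) and the sample rows are hypothetical.

```python
from datetime import date

# Hypothetical application records; the field names are assumptions.
applications = [
    {"entrepreneur": "A", "enterprise": "X", "received": date(2013, 5, 1)},
    {"entrepreneur": "A", "enterprise": "X", "received": date(2012, 3, 1)},
    {"entrepreneur": "B", "enterprise": "Y", "received": date(2014, 1, 10)},
]


def keep_first_application(rows):
    """Keep the earliest application for each (entrepreneur, enterprise) pair."""
    first = {}
    for row in sorted(rows, key=lambda r: r["received"]):
        key = (row["entrepreneur"], row["enterprise"])
        first.setdefault(key, row)  # later applications for the same pair are ignored
    return list(first.values())


unique_rows = keep_first_application(applications)
print(len(unique_rows))
```

Sorting by date before de-duplicating makes the "first application wins" rule explicit and independent of the original row order.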
At the next step the Business against corruption database was extended with Spark-Interfax data, which includes information about the main characteristics of the enterprise, such as size, activity type and court cases. Company profiles were matched using information about the applicants, such as the registration region, the name of the company and the name of the applicant (victim).
Of the 1140 applications in the initial dataset, 482 were included in the final version of the database at this stage. Most excluded observations were removed either because no exact match could be found in the Spark database, as the application was initiated by an individual entrepreneur without any company name (multiple entrepreneurs in the same region), or because the observation was a repetition of the same “entrepreneur - enterprise” combination.
Information about administrative and political connections was taken from the Spark database (the service provides a mass media search for the last 12 months and a search for enterprises associated with a person), mass media publications (high-profile cases usually include mentions of an entrepreneur's political party memberships or administrative positions) and the Central Election Commission.
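The matching step described above (region, company name and applicant name, keeping only exact matches) might look roughly like the sketch below. It is a hedged Python illustration: the field names, the normalization rule and the sample records are all assumptions, and the actual matching procedure used for the Spark data is not documented in the text.

```python
def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial spelling differences do not block a match."""
    return " ".join(text.lower().split())


def match_profiles(applications, profiles):
    """Pair each application with a company profile only when the match is unique."""
    index = {}
    for profile in profiles:
        key = (normalize(profile["region"]), normalize(profile["company"]), normalize(profile["person"]))
        index.setdefault(key, []).append(profile)
    matched = []
    for app in applications:
        key = (normalize(app["region"]), normalize(app["company"]), normalize(app["applicant"]))
        candidates = index.get(key, [])
        if len(candidates) == 1:  # ambiguous or missing matches are dropped
            matched.append((app, candidates[0]))
    return matched


# Hypothetical records for illustration only.
applications = [
    {"region": "Moscow", "company": "Alpha  LLC", "applicant": "Ivan Petrov"},
    {"region": "Kazan", "company": "", "applicant": "Petr Ivanov"},
]
profiles = [
    {"region": "moscow", "company": "alpha llc", "person": "ivan petrov", "size": "small"},
]
pairs = match_profiles(applications, profiles)
print(len(pairs))
```

Dropping ambiguous candidates mirrors the exclusion of applications for which no exact match could be found, for example an individual entrepreneur without a company name.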
2.3 Key variables creation and variables selection procedure
As stated in the research design section, the given sample provides the researcher with a variety of cases, including a variety of threats to potential victims and a variety of potential outcomes. For instance, a situation in which the key stakeholders of a company cannot agree on the distribution of company shares poses a noticeably different level of threat to the case participant than a situation in which a criminal case is initiated against an entrepreneur in order to influence his political position.
Since we cannot express the result of a case as a scale-type variable (because the outcomes are different and obviously do not belong to the same scale), it is possible to record only the fact of “winning” or “losing” the case. Thus, the research requires a binary target variable. However, since the outcomes differ from case to case and “success” and “failure” are relative terms, a single target variable would be an oversimplification, so it was decided to create several binary variables based on the amount of damage received. After studying the parsed data, it was also decided to use the “current status” variable taken from the Spark database. Unlike all other target variables, the status of the enterprise is not relational but more objective, so it can be observed regardless of the nature of the case being studied.
Following such logic, different target variables were created for testing different thresholds:
1) “Is_working” is a binary target variable: 1 means that the enterprise was operating as of 2020-05-01; 0 means that the enterprise was closed before 2020 (for any reason: liquidation, bankruptcy, exclusion from the Unified State Register of Legal Entities (EGRUL)). It has no missing data because the Spark database provides information about the current status of all the selected enterprises.
2) “Target_light_clear” is a binary target variable. It includes only clear cases, that is, cases in which a clear outcome was found (a publication or a court case). A mild threshold is used in this target variable: 1 stands for cases in which the businessman did not receive a prison term or the enterprise was not closed (although both may have been damaged). 0 stands for the other cases.
3) “Target_light_extended” is a binary target variable. It repeats the previous target variable but also includes cases in which no clear outcome was found, but the businessman considered to be a victim in the application is still in the management of the company and the company still operates. Such cases are treated as 1 (since there might have been damage, but the outcome is still favorable to the businessman). 0 stands for the other cases.
4) “Target_strong_extended” is a binary target variable with a strong threshold: a case is coded 1 (“win”) if and only if neither the businessman nor the enterprise received any damage except procedural damage. 0 stands for the other cases.
A table with the variables can be found in Attachment 4. This table includes the rationale for including each variable in the final analysis. The full list of variables with detailed information about data type and creation logic can be seen in Attachment 1.
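Two of the coding rules listed above can be sketched in Python as follows. This is an illustrative reading rather than the author's script: the status label `ACTIVE`, the function names and the argument names are hypothetical, and leaving cases without a clear outcome as missing (`None`) when the extended rule does not apply is an assumption based on the missing-data counts reported later for the extended targets.

```python
from typing import Optional

ACTIVE = "active"  # hypothetical status label; the real Spark status values are not given


def is_working(status: str) -> int:
    """1 if the enterprise is operating at the reference date, 0 if it was closed."""
    return 1 if status == ACTIVE else 0


def target_light_extended(clear_outcome: Optional[int],
                          still_in_management: bool,
                          company_works: bool) -> Optional[int]:
    """Extend the clear-outcome target with cases that lack a documented outcome."""
    if clear_outcome is not None:
        return clear_outcome  # a documented outcome is used as is
    if still_in_management and company_works:
        return 1  # no clear outcome, but the situation is favorable to the businessman
    return None  # assumption: the remaining cases stay missing


print(is_working("active"), target_light_extended(None, True, True))
```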
2.4 Data analysis methods
The choice of data analysis methods followed the general logic of the data analysis procedure. All computations were performed in RStudio.
First, at the exploratory analysis step, basic tools such as histograms, contingency tables and basic statistics (mean, median) were examined. To check whether continuous variables are normally distributed, the Shapiro-Wilk test for normality was used in combination with Q-Q plots.
Second, at the step of finding basic relationships in the data, Pearson's chi-squared test was used to determine whether there is an association between categorical variables and target variables. To find out whether continuous variables differ between the groups defined by the target variables, group medians were compared using the non-parametric Kruskal-Wallis rank sum test (since all continuous variables included in the analysis had non-normal distributions).
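For readers unfamiliar with the test, the Pearson chi-squared statistic for a contingency table can be computed as in the minimal Python sketch below. The counts are invented for illustration, and the sketch applies no continuity correction, whereas R's `chisq.test` applies Yates' correction to 2x2 tables by default, so results may differ from those of the original analysis. The closed-form p-value shown is valid only for one degree of freedom.

```python
import math


def pearson_chi2(table):
    """Return the Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical 2x2 table: rows = categorical factor (yes/no), columns = target (1/0).
stat, dof = pearson_chi2([[30, 20], [20, 30]])
p = p_value_df1(stat)
print(stat, dof, round(p, 4))
```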
Finally, at the modelling step logistic regression was chosen as the main method due to the relatively small dataset size (482 observations) and the presence of binary target variables. The logistic regression models were built using AIC minimization, VIF analysis and the analysis of deviance (ANOVA). The built-in stepwise algorithm for AIC minimization available in R was applied. At the final step, a decision tree model for the “is currently working” target variable was built. The decision tree model was validated by its performance on train and test samples, with accuracy as the main prediction quality metric (since the positive and negative classes of the “is_working” target variable were approximately equally distributed). During the assessment of the model, its specificity and sensitivity were also considered. Model scores were calibrated using the F1 score (the harmonic mean of the model's precision and recall) as the basic metric.
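The quality metrics named above all follow from the confusion matrix. The Python sketch below shows one way to compute them; the labels and predictions are made up for the example, and the function name `classification_metrics` is an assumption rather than anything taken from the original R script.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, precision and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, len(y_true)),
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical test-sample labels and model predictions.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
print(metrics)
```

Accuracy is informative here mainly because the classes are roughly balanced; with a skewed target, sensitivity, specificity and F1 carry more of the assessment.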
Section 3. Data analysis
The final dataset contained 482 observations and 4 target variables. Each observation stood for a unique “entrepreneur - enterprise” combination. These combinations were taken from applications submitted to the Center of Public Procedures “Business against corruption” in order to receive help in cases that the applicants considered unlawful. The full data analysis script can be found in Attachment 5. The purpose of this section is to describe the overall logic of the analysis, highlight the main results and summarize key findings.
3.1 Exploratory data analysis. Feature engineering
Data exploration started with an examination of missing data.
Figure 2. Data missingness map
As shown in figure 2, most of the data (98%) is complete. The variables with missing data are the target variables, the size variable (not all companies have information on the number of employees in the Spark database) and the authorized capital variable (for the same reason as for size). The distribution of missing data can be seen in table 1.
Table 1
Missing data distribution
Variable name | Rows with missing data, count | Rows with missing data, %
N_employees_upperbound | 50 | 10%
Authorized_capital | 16 | 3%
Target_light_clear | 122 | 25%
Target_light_extended | 90 | 19%
Target_strong_extended | 90 | 19%
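The counts and shares in table 1 can be obtained with a simple per-column tally. The Python sketch below assumes, for illustration only, that observations are stored as dictionaries with `None` marking a missing value; the function names are hypothetical and the thesis itself used R.

```python
def percent(count, total):
    """Share of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)


def missing_summary(rows, columns):
    """Return {column: (missing_count, missing_percent)} for a list of dict rows."""
    total = len(rows)
    summary = {}
    for column in columns:
        missing = sum(1 for row in rows if row.get(column) is None)
        summary[column] = (missing, percent(missing, total))
    return summary


# Toy rows (hypothetical values) showing the expected input shape.
toy_rows = [
    {"Authorized_capital": 10000, "N_employees_upperbound": None},
    {"Authorized_capital": None, "N_employees_upperbound": None},
]
toy_summary = missing_summary(toy_rows, ["Authorized_capital", "N_employees_upperbound"])

# The percentages in table 1 follow from the reported counts and the 482 observations.
table_1_percentages = [percent(count, 482) for count in (50, 16, 122, 90)]
```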
The next step was to check the distribution of the target variables.
Table 2
Target variables distribution
Variable name | Valid observations, count | Positive class observations (1), % | Negative class observations (0), %
Is_working | 482 | 47% | 53%
Target_light_clear | 360 | 49% | 51%
Target_light_extended | 392 | 53% | 51%
Target_strong_extended | 392 | 23% | 77%
As can be seen from table 2, 3 out of 4 target variables have approximately equally distributed positive and negative classes. The “Target_strong_extended” target variable implies a stricter threshold for defining the positive and negative classes, so its proportions are shifted towards the negative class.
The “Is_working” target variable, which indicates whether the enterprise is still working in 2020, has no missing values, since the Spark database has clear information about the current status of most enterprises. The “Target_light_clear” target variable has the fewest valid observations because it excludes both cases whose results were not found and doubtful cases (when there are no data about the outcome of the case but the enterprise is still working and the entrepreneur holds a managing position).
The analysis of continuous variables was the next step of data exploration. While most of the variables in the dataset were categorical, the age, size and authorized capital variables remained on a continuous scale.
Age. Based on the company registration date, company liquidation date and application creation date, three variables denoting the age of the enterprise at different points in time were created:
1) Company age until 2020 - number of years from the date of company registration to 2020-05-01 (ignoring the liquidation date). This variable shows the age profile the companies would have if all of them had survived until the present day.
2) Company age including liquidation - number of years from the date of company registration to the present day or to the closing or liquidation date. This variable shows the companies' real age profile up to the present day.
3) Company age until application date - number of years from the date of company registration to the date of application creation. This variable shows the companies' age profile at the moment when the “Business against corruption” applications were created.
The limitation here is that only the last variable, “company age until application”, is applicable for data analysis, because the other two variables allow the researcher to “look into the future”: they contain information about how long the enterprise lived. However, it was decided to study the distributions of all three variables in case any interesting results were found.
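The three age variables defined above can be sketched in Python as follows. The function and key names are hypothetical, and measuring age in fractional years (days divided by 365.25) is an assumption; the text only says "number of years". The reference date 2020-05-01 is taken from the definition of the first variable.

```python
from datetime import date

REFERENCE_DATE = date(2020, 5, 1)  # "present day" used in the variable definitions


def years_between(start, end):
    """Elapsed time in fractional years (assumption: 365.25 days per year)."""
    return (end - start).days / 365.25


def company_ages(registered, application, liquidated=None, reference=REFERENCE_DATE):
    """Return the three age variables for one enterprise."""
    # A company that was never liquidated is aged up to the reference date.
    end_of_life = liquidated if liquidated is not None else reference
    return {
        "age_until_2020": years_between(registered, reference),
        "age_including_liquidation": years_between(registered, end_of_life),
        "age_until_application": years_between(registered, application),
    }


# Hypothetical enterprise: registered 2005, applied 2012, liquidated 2015.
ages = company_ages(
    registered=date(2005, 5, 1),
    application=date(2012, 5, 1),
    liquidated=date(2015, 5, 1),
)
```

Only `age_until_application` is known at the moment the application is filed, which is why it is the only one of the three that does not leak information about the outcome.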
The following figure shows the distributions of the age variables.
Company age until 2020: mean age 17 years, median age 16.5 years
Company age including liquidation: mean age 15.3 years, median age 15 years
Company age until application date: mean age 10 years, median age 9 years
Figure 3. Companies age variables distribution in given point of time
From figure 3 we can see that the age data are most probably not normally distributed in any of the three cases (a Shapiro-Wilk test for normality was also conducted; for details see Attachment 2). The age variables have no outliers.
The next step was to check whether a relationship exists between the age variable and the target variables, in other words, whether the “company age until application” variable differs significantly between the groups formed by the target variables. The common way to check whether the values of a continuous variable differ between groups of a categorical variable is to compare the group means. However, since the age variables are not normally distributed, the non-parametric Kruskal-Wallis rank sum test was chosen as an analogue. This test is based on ranks and compares medians instead of means, so it is not tied to any particular distribution and is less sensitive to outliers in the data.
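To make the rank-based logic of the test concrete, here is a minimal Python sketch of the Kruskal-Wallis H statistic with the usual tie correction. It is an illustration, not the R `kruskal.test` call used in the thesis; the function name and the toy data are assumptions. The p-value is computed only for one degree of freedom (two groups), which is the case in this analysis, using the closed form of the chi-squared tail.

```python
import math
from collections import Counter


def kruskal_wallis(*groups):
    """Return (H, df, p_value); p_value is None unless df == 1.

    Assumes at least two distinct values in the pooled data.
    """
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = Counter(pooled)
    h /= 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)

    df = len(groups) - 1
    # For df = 1 the chi-squared upper tail equals erfc(sqrt(H / 2)).
    p_value = math.erfc(math.sqrt(h / 2)) if df == 1 else None
    return h, df, p_value


# Hypothetical ages for two groups (not thesis data).
h, df, p_value = kruskal_wallis([1, 2, 3], [4, 5, 6])
```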
Table 3
Results of Kruskal-Wallis tests of age variables by target variables
Age variable name | Target variable name | Kruskal-Wallis chi-squared | Degrees of Freedom | P-value
Company age until application | Is working | 6.1061 | 1 | 0.0134
Company age until application | Target light clear | 1.4822 | 1 | 0.2234
Company age until application | Target light extended | 0.9357 | 1 | 0.3334
Company age until application | Target strong extended | 0.0092 | 1 | 0.9234
As the Kruskal-Wallis tests show, the median age of the enterprise at the moment of application submission does not differ significantly between the groups formed by the target variables “Target light clear”, “Target light extended” and “Target strong extended”. In the case of the “is working” target variable, the data provide enough evidence that the group medians are not equal.
Table 4
Basic statistics for “company age until application” by currently working status
Variable | 1st quart. | Median | Mean | 3rd quart.
Company age until application, closed enterprises | 5.0 | 9.0 | 9.6 | 13.5
Company age until application, working enterprises | 6.0 | 10.0 | 11.0 | 17.0
Thus, according to table 4, the mean and median age of the companies that are still working in 2020 are shifted towards higher values. These are just preliminary observations, and no conclusions were drawn at this point. The only conclusion from this part is that the “company age until application” variable can be included in later models with the “is_working” target variable in order to estimate how the age of the enterprise influences its survival chances.
Size. The size of the enterprises was originally measured on a continuous scale. It is based on the upper bound of the Spark number-of-workers interval in the year closest to the application. The distribution of the initially received data was the following:
Figure 4. The distribution of “company size” variable
As can be seen from figure 4, the distribution of this variable is highly skewed. The data also contain outliers.
Data transformation methods such as log transformation and square transformation were tried for the size variable. However, these methods showed unsatisfactory results. Following the general logic of data preparation, it was decided to create a categorical analogue of the variable. Since the official criterion that defines the size of an enterprise in Russian legislation is a complex one that includes a number of factors, such as net income and size ["On Small and Medium-Sized Enterprises Development in Russian Federation", 2007], only the official number-of-employees criteria were used to categorize the size variable. The distribution of the new categorical variable was the following:
Figure 5. Categorical variable “company size”, missing data excluded
Here “Micro” is an organization with under 15 employees, “Small” is an organization with 16 to 100 employees, “Medium” is an organization with 101 to 250 employees and “Big” is an organization with over 250 employees.
It is clear from figure 5 that the majority of enterprises in the sample have under 15 employees (and are considered “Micro” enterprises). Then come “Small” enterprises, “Big” enterprises and, finally, “Medium” sized enterprises.
However, the size data still had missing values. As stated in the previous section, we may assume that if the Spark database has no information about an enterprise, then the enterprise is not significant in terms of number of employees, and the lowest possible value can be assigned to these observations. One important criterion here is that the initial data structure should not be distorted.
To discover whether imputing the minimal number of employees would distort the relationships between the categorical size variables and the target variables, contingency tables were analysed (see Attachment 2 for more). The review of the contingency tables established that imputing the missing data does distort the initial proportions and increases the negative class probability in the sample, which could increase the type II error (false negative) when the resulting model is applied to real data. Thus, the option of missing data imputation for the size variables was rejected.
Another feature revealed by figure 5 is the uneven distribution of observations over the categories of the variable. It was decided to try various groupings in order to see which of them have more explanatory power with different target variables. The final list of size variables included:
1) Division by size without further grouping, including missing data (4 groups).
2) Division by size, opposing Micro enterprises to other categories (as Micro enterprises are the biggest category of all).
3) Division by size, opposing Micro and Small enterprises to other categories (as Micro and Small enterprises may have similar chances for survival).
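The size categories and the two binary groupings listed above can be sketched in Python as follows. The function names and group labels are hypothetical. One boundary is an assumption: the text defines “Micro” as under 15 and “Small” as 16 to 100 employees, so an enterprise with exactly 15 employees is not covered; the sketch assigns it to “Micro”.

```python
def size_category(n_employees):
    """Map the upper bound of the employee count to a size category."""
    if n_employees is None:
        return None  # missing values are kept as missing, not imputed
    if n_employees <= 15:  # assumption: exactly 15 is treated as Micro
        return "Micro"
    if n_employees <= 100:
        return "Small"
    if n_employees <= 250:
        return "Medium"
    return "Big"


def micro_vs_else(category):
    """Grouping 2: Micro enterprises against all other categories."""
    if category is None:
        return None
    return "Micro" if category == "Micro" else "Else"


def micro_small_vs_else(category):
    """Grouping 3: Micro and Small enterprises against all other categories."""
    if category is None:
        return None
    return "Micro or Small" if category in ("Micro", "Small") else "Else"


# Hypothetical employee counts, including one missing value.
categories = [size_category(n) for n in (10, 16, 100, 101, 250, 251, None)]
```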
Authorized capital. The authorized capital variable was treated similarly to the size variable. The distribution of the variable is also skewed, and the data contain outliers.
Figure 6. Distribution of “company authorized capital” variable
Following the general logic of data preparation, it was decided to create a categorical “authorized capital” variable. The authorized capital variable was divided into three equal parts so that the resulting categories would be as proportional as possible. The resulting distribution is the following:
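One way to read "three equal parts" is a split into three groups of (nearly) equal size by rank, that is, by tertiles. The Python sketch below illustrates that reading only; the exact cut points used in the thesis are not given here, and the function name, labels and toy values are assumptions.

```python
def tertile_labels(values):
    """Label each value 'low', 'medium' or 'high' so the groups are as equal in size as possible.

    Assumes no missing values; equal values may fall into different groups.
    """
    n = len(values)
    # Indices of the observations ordered from smallest to largest value.
    order = sorted(range(n), key=lambda index: values[index])
    names = ("low", "medium", "high")
    labels = [None] * n
    for rank, index in enumerate(order):
        labels[index] = names[rank * 3 // n]  # rank 0..n-1 maps to group 0..2
    return labels


# Hypothetical authorized capital values in roubles.
capital = [10_000, 10_000_000, 50_000, 300_000, 20_000, 1_000_000]
capital_groups = tertile_labels(capital)
```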
Figure 7. Authorized capital categorical variable distribution
The next step was to prepare the categorical variables and prevent the appearance of extremely small categories, or of categories with extremely high or extremely low chances of survival.
Categories were enlarged for the following variables: region (grouped by federal districts and closest federal districts), OKVED business activity code (grouped by the largest OKVED categories) and the maximal “Business against corruption” (BAC) stage the application reached. The latter variable was grouped into three general steps: Information gathering (preparation for the case discussion), Resolution (the expertise on whether the rights of the entrepreneur were violated) and Council conclusion (if the case was pushed to the final step - public discussion in the BAC public council). See Attachment 1 for the variable grouping logic.
3.2 Relationship analysis
To check whether any statistically significant relationship exists between categorical variables, pairwise Pearson's Chi-squared tests were conducted between each variable and each target variable. See the R code in Attachment 5 for results, or Attachment 3 for the full table with statistic values and p-values.
Before the Chi-squared statistics were calculated, the distribution of each variable against each target variable was checked for extremely small groups (5 observations or fewer), since they can distort the subsequent tests. As a result, a list of variables with extremely small groups was created. Detailed variables such as “spark_region” or “macro_okved_code” were removed from further analysis, since some of their sub-groups were not big enough.
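The two checks described above, the small-group screen and the Chi-squared test itself, can be sketched in plain Python. This is an illustration under stated assumptions, not the code from Attachment 5: the function names and the toy table are hypothetical, no continuity correction is applied (R's `chisq.test` applies Yates' correction to 2x2 tables by default), and the p-value is computed only for one degree of freedom.

```python
import math


def has_small_cells(table, threshold=5):
    """True if any cell of the contingency table has `threshold` observations or fewer."""
    return any(observed <= threshold for row in table for observed in row)


def chi_squared_test(table):
    """Pearson's Chi-squared test of independence for a table given as a list of rows.

    Returns (statistic, df, p_value); p_value is None unless df == 1.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    # For df = 1 the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2)) if df == 1 else None
    return statistic, df, p_value


# Hypothetical 2x2 table: rows = predictor categories, columns = target classes.
toy_table = [[30, 10], [20, 40]]
statistic, df, p_value = chi_squared_test(toy_table)
```

A variable whose table fails the `has_small_cells` screen would be regrouped or dropped before testing, as was done for the detailed region and activity code variables.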
To systematize the results of Pearson's Chi-squared tests, the following table was composed.
Table 5
Target variable associations with other variables, according to Pearson's Chi-squared test
Target variable name | Associated variables (Pearson's Chi-squared test p-value <= 0.09)
Is_working (Does the enterprise work in 2020) | “Spark web site”; “Category by size micro else”; “Administrative position”; “In political party”; “In association or SRO”; “Capture”; “Barriers”; “Reaction not passed by the applicant”; “To ombudsman”; “Auth capital group”; “Federal district North West”; “Macro OKVED code Financial insurance”; “Macro OKVED code Other categories”; “Macro OKVED code real estate”; “Macro OKVED code rural”; “Macro OKVED code Trading”; “max_bac_stage.3”
Target light clear (Businessman was not sent to jail and the enterprise was not closed, excluding controversial cases) | “Spark web site”; “Category by size micro else”; “Category by size”; “In association or SRO”; “Capture”; “Macro OKVED code Financial insurance”; “Macro OKVED code real estate”; “Macro OKVED code Science”; “max_bac_stage.4”
Target light extended (Businessman was not sent to jail and the enterprise was not closed, including controversial cases) | “Spark web site”; “Category by size micro else”; “Category by size”; “In association or SRO”; “Case publications”; “Capture”; “Macro OKVED code Financial insurance”; “Macro OKVED code real estate”; “Macro OKVED code Science”; “max_bac_stage.4”
Target strong extended (Businessman and the enterprise do not face any damage except processual spendings) | “Spark web site”; “category_by_size_2_cat”; “In association or SRO”; “Case publications”; “Capture”; “Barriers”; “Reaction consultation”; “Moscow Region”; “Macro OKVED code Science”
Although the associated variables vary among target variables, some basic associations can be singled out. First, no consistent association between the federal district variables and the target variables was found: two dummy variables for federal districts appear, seemingly at random, in 2 out of 4 targets, so the territorial features in this sample do not appear to be strongly associated with survival. Second, the OKVED business activity code variable has a significant relationship with all target variables, which is consistent with previous findings that some spheres of activity are more dangerous for doing business than others [Kazun, 2015]. Third, website availability is consistently significant across all four target variables. No support for this fact was found in previous studies, so it may well be a spurious correlation in the data. The idea that the chances to survive may depend on company size is also supported by previous findings in this field [Markus, 2012]. The same holds for membership in associations and self-regulatory organizations [Yakovlev, 2015]. Among the characteristics of the case, capture appears for all four target variables and administrative barriers for two of them.
At the same time, variables connected with the procedural stages in “Business against corruption” appear to be significant only partially.
3.3 Modelling
To reveal how each variable may influence the chances of a “positive” outcome, four logistic regression models were built, one for each target variable. The main idea of building 4 models was to look at the most stable combinations of significant variables, which are common to all models, and to define how these predictors influence the chances of the entrepreneur to avoid critical damage and enhance the chances of the enterprise's survival in the long run.
The algorithm of the analysis was the same for all four models. At the first step, the baseline model was built. It included all possible variables from the dataset (excluding those variables that were divided by the target variable into extremely small groups). Second, the model was checked for multicollinearity using the variance inflation factor (VIF), which was calculated for each regression coefficient b_i and shows how much the variance of that coefficient is inflated by the association between the given predictor and the other predictors. After the VIF test, ANOVA tests were run to reveal how the inclusion of each variable changes the deviance. The ANOVA test performs a likelihood ratio test: starting from the empty model, it adds predictors to the model one by one, and at each step it calculates the change in deviance that the predictor contributes to the previously defined model (the model with all previously tested predictors). At the third step, variables were removed manually in accordance with the AIC criterion and theoretical justifications. As a result, an improved model was created. At the fourth step, the baseline model was refined automatically using the “stepAIC” procedure (R package MASS), which searches stepwise through combinations of predictors and returns the model with the lowest AIC found.
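The AIC-driven selection step can be illustrated with a minimal Python sketch of greedy backward elimination. It is a simplification of what stepwise procedures do, not a reimplementation of `stepAIC`; the function names, predictor names and log-likelihood values are all hypothetical. AIC is computed as 2k - 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood, and lower values are better.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(predictors, score):
    """Drop one predictor at a time while doing so lowers the score (AIC).

    `score` maps a frozenset of predictor names to the AIC of the fitted model.
    Returns (sorted remaining predictors, best score).
    """
    current = frozenset(predictors)
    best = score(current)
    while current:
        # Score every model obtained by removing exactly one predictor.
        candidates = [(score(current - {name}), name) for name in sorted(current)]
        candidate_score, dropped = min(candidates)
        if candidate_score >= best:
            break  # no single removal improves the model
        current, best = current - {dropped}, candidate_score
    return sorted(current), best


# Hypothetical maximized log-likelihoods for every subset of three predictors.
TOY_LOG_LIKELIHOOD = {
    frozenset(): -330.0,
    frozenset({"age"}): -311.0,
    frozenset({"size"}): -306.0,
    frozenset({"age", "size"}): -300.2,
    frozenset({"age", "noise"}): -310.0,
    frozenset({"size", "noise"}): -305.0,
    frozenset({"age", "size", "noise"}): -300.0,
}


def toy_score(subset):
    # One parameter per predictor plus the intercept.
    return aic(TOY_LOG_LIKELIHOOD[subset], len(subset) + 1)


selected, selected_aic = backward_stepwise(["age", "size", "noise"], toy_score)
```

In this toy example the uninformative predictor is dropped because removing it costs almost no likelihood while saving a parameter, after which no further removal lowers the AIC.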
The last step was to compare all four models and identify the predictors that are significant for all target variables.
The following section describes the final models for each of the four target variables and summarizes the findings about the consistently significant set of predictors.