Review of the parallel hyperquick sort algorithm by C#

Application of sorting to organize large data arrays. Data sorting algorithms. Study of the algorithm for parallel hyperfast sorting of data arrays in the C# programming language. Development of flow graph and program code. Using the test program.

Рубрика Программирование, компьютеры и кибернетика
Вид статья
Язык английский
Дата добавления 08.06.2024
Размер файла 655,6 K

Отправить свою хорошую работу в базу знаний просто. Используйте форму, расположенную ниже

Студенты, аспиранты, молодые ученые, использующие базу знаний в своей учебе и работе, будут вам очень благодарны.

Размещено на http://www.allbest.ru/

Размещено на http://www.allbest.ru/

Review of the parallel hyperquick sort algorithm by C#

Denysiuk Valerii Olexandrovich PhD, assistant professor of Computer Sciences Department, Vinnytsia National Technical University, Vinnytsia,

Abstract

Sorting is used to arrange large arrays of data and present them in a certain structured form. With the development of distributed systems and parallel computing, algorithms were also developed that used the advantages of these technologies to improve and increase the efficiency of the tasks of sorting data arrays. The article discusses the algorithm for parallel hyperquick sorting of data arrays by C# programming language. Hyper quicksort is a type of quicksort algorithm that uses multiple nested items to divide a list into sublists. This allows the algorithm to sort the list faster than the standard quicksort algorithm, The algorithm uses multiple nested elements to divide the list into more than two sublists, which allows the list to be sorted more efficiently, especially when the list is already partially sorted or when it has other special properties. Using multiple aggregates can allow the algorithm to skip large sections of the list that are already in the correct order, instead of recursively sorting them. In the case where the list has some other special property, such as a large number of elements that are equal to the reference part, using multiple reference elements can allow the algorithm to partition the list more efficiently.

The algorithm is an improvement using a simple combination of serial and parallel approaches. Reduces the problem of load balancing. Improves finding the median by sequentially sorting the sublists using a single reference point that is broadcast to all processes at the beginning of the algorithm.

The flow graph and program code of the parallel hyperquick sorting algorithm were developed by C# language.

The algorithm was tested for a large array of data. To test the parallel hyperquick sort algorithm, a program was created that implements the parallel hyperquick sort algorithm and related algorithms, namely: sequential quick sort; naive parallel sorting; optimized parallel sorting; parallel-serial sorting; parallel quicksort by regular sampling. The testing program has the following capabilities: generation of an array with random values according to the size specified by the user; sorting a randomly generated array of data using one of the selected algorithms; measuring the time spent on sorting an array of N elements. The analysis of the obtained data confirms the good time performance of the proposed algorithm for parallel hyperquick sorting of data arrays in comparison with existing sorting algorithms.

Keywords: program code, sorting, algorithm, sorting sublist, parallel algorithm, hyperfast sorting

Анотація

Денисюк Валерій Олександрович кандидат технічних наук, доцент, доцент кафедри комп'ютерних наук, Вінницький національний технічний університет, м. Вінниця

РОЗГЛЯД АЛГОРИТМА ПАРАЛЕЛЬНОГО ГІПЕРШВИДКОГО СОРТУВАННЯ МОВОЮ C#

Для впорядкування великих масивів даних та подання їх в певному структурованому вигляді використовується сортування. З розвитком розподілених систем та паралельних обчислень розвивалися і алгоритми, які використовували переваги даних технологій для вдосконалення і підвищення ефективності задач сортування масивів даних.У статті розглянуто алгоритм паралельного гіпершвидкого сортування масивів даних мовою програмування C#. Гіпершвидке сортування є різновидом алгоритму швидкого сортування, який використовує кілька зведених елементів для поділу списку на підсписки. Це дозволяє алгоритму сортувати список швидше, ніж стандартний алгоритм швидкого сортування, Алгоритм використовує кілька зведених елементів для поділу списку на більше ніж два підсписки, що дозволяє сортувати список ефективніше, особливо коли список вже частково відсортовано або коли він має інші спеціальні властивості. Використання кількох зведених елементів може дозволити алгоритму пропускати великі розділи списку, які вже знаходяться в правильному порядку, замість того, щоб їх рекурсивно сортувати. У випадку, коли список має деякі інші особливі властивості, наприклад, велика кількість елементів, які дорівнюють опорній частині, використання кількох опорних елементів може дозволити алгоритму розділити список більш ефективно.

Алгоритм є вдосконаленням використання простої комбінації послідовного та паралельного підходів. Зменшує проблему балансування навантаження. Покращує знаходження медіани шляхом послідовного сортування підсписків за допомогою однієї опорної точки, яка транслюється усім процесам на початку алгоритму.

Розроблено потоковий граф і програмний код алгоритму паралельного гіпершвидкого сортування мовою прогумування C#.

Проведено тестування алгоритму для великого масиву даних. Для тестування алгоритму паралельного гіпершвидкого сортування створено програму, яка реалізує паралельний алгоритм гіпершвидкого сортування та споріднені алгоритми, а саме: послідовне швидке сортування; наївне паралельне сортування; оптимізоване паралельне сортування; паралельно- послідовне сортування; паралельне швидке сортування шляхом регулярної вибірки. Програма тестування має такі можливості: генерування масиву з випадковими значеннями за розміром заданим користувачем; сортування випадково згенерованого масиву даних за допомогою одного із обраних алгоритмів; вимірювання часу витраченого на сортування масиву з N елементів. Аналіз отриманих даних підтверджує хороші часові показники запропонованого алгоритма паралельного гіпершвидкого сортування масивів даних у порівнянні з існуючимими алгоритмами сортування.

Ключові слова: програмний код, сортування, алгоритм, підсписок сортування, паралельний алгоритм, гіпершвидке сортування

Introduction

algorithm parallel sorting c# programming language

In the era of the development of digital technologies, the most important sciences remain the sciences of working with data: their arrangement, editing, search, organization, etc. Sorting is used to arrange large arrays of data and present them in a certain structured form.

Many different algorithms have been developed for data sorting, which have their own advantages and disadvantages [1,2].

With the development of distributed systems and parallel computing, algorithms were also developed that used the advantages of these technologies to improve and increase efficiency. One of these algorithms is the hyper-fast sorting algorithm. It is parallel by default because it is based on an advanced parallel-serial sort method. In turn, this algorithm consists of a parallel and sequential sorting method, according to the name. There is also an even more innovative technology of parallel sorting with regular sampling, but the efficiency of each of these algorithms will depend on many factors, namely: the size of the array, its filling, the power of the processor, the number of cores and threads.

The main part

Analysis of the subject area

Hyper-quicksort is a type of quicksort algorithm that uses multiple nested elements to divide a list into sublists. This allows the algorithm to sort the list faster than the standard quicksort algorithm, especially for lists that are already partially sorted or have some other special properties [3].

In a standard quicksort algorithm, one aggregate element is selected from the list, and the other elements are split into two sublists based on whether they are smaller than or greater than the aggregate. The anchor point is then placed in the final position and the two sublists are recursively sorted using the same process.

In contrast to the standard quicksort algorithm, a hyperquicksort algorithm is considered, which uses several composite elements to divide a list into more than two sublists. This allows the algorithm to sort the list more efficiently in certain cases, especially when the list is already partially sorted or when it has other special properties.

For example, if the list is already partially sorted, using multiple aggregates can allow the algorithm to skip large sections of the list that are already in the correct order, instead of recursively sorting them. In the case where the list has some other special property, such as a large number of elements that are equal to the reference part, using multiple reference elements can allow the algorithm to partition the list more efficiently.

In general, hyper-quicksort is a useful variant of the quicksort algorithm that can provide improved performance for certain types of lists. However, this is not always the best choice, and the standard quicksort algorithm may be more efficient in some cases.

One of the possible hyperquick sorting algorithms is as follows.

1. If the size of the list (sorting array) is smaller than some threshold value, for example, 8 elements, then it is better to use another sorting algorithm (insertion, merging, etc. [1, 2]).

2. Select several pivot elements from the list.

3. Split the input list into sublists based on the aggregate elements such that all elements in each sublist are less than or equal to the corresponding aggregate element.

4. Sort each sublist using the hyperquick sorting algorithm.

5. Merge the sorted sublists to get the final sorted list.

6. Return the sorted list.

It is proposed to investigate a parallel hyperquick sorting algorithm that uses the number of processes (threads) equal to the number of reference elements.

It is based on the modified parallel quicksort algorithm, which uses the sequential quicksort and optimized parallel quicksort approaches.

Mathematical modeling of the parallel hyperquick sorting algorithm

By default, hyper-quicksort is a parallel algorithm because it uses the sequential quicksort and parallel quicksort algorithms.

In turn, the quick sort algorithm is not parallel by default. There are several ways to make the quicksort algorithm parallel.

First, using a parallel divide-and-conquer strategy where each subtask is solved in parallel. This can be done using techniques such as multithreading or multiprocessing.

Second, using a parallel sort algorithm, such as parallel merge sort, as a variant of the merge sort algorithm. This can be done using methods such as:

• SIMD - one instruction, several data;

• GPGPU - general-purpose computing on graphics processors.

Third, the use of a hybrid approach. Namely: a quick sort algorithm is used to initially split the data, and then a parallel sort algorithm is used to sort the resulting sub-solutions. This will effectively balance the high performance benefits of fastsort in the average case with the parallelization capabilities of other sorting algorithms.

Fig. 1 Flow graph ofparallel sorting (DR - Depth of Recursion, T - Tread

Fig. 1 shows a parallel sort flow graph.

Fig. 2 shows an example of a flow graph for hyperquick sorting of an array of 16 elements.

Fig. 2 A flow graph for hyperquick sorting of an array of 16 elements (DR - Depth of Recursion, T - Tread)

The parallel hyperquick sorting algorithm is as follows.

1. A list of size n is divided among n processes. Assume list of size 16 and 4 processes, each process will handle 4 elements.

2. A process among the four responsible for finding the pivot element, finds a pivot and broadcasts it to all processes which sort their sublists sequentially using the broadcasted pivot element. This step will improve chances of finding pivots close to the true median.

3. We perform the steps:

• Randomly select a pivot element and broadcast it to all partner processes.

• Each process will partition its elements and divide them into two groups according to the selected pivot. group 1 <= pivot <= group2. This happens parallelly across all processes concurrently.

• Each process in upper half of the process list sends its "low list" to a partner process in the lower half of the process list and it receives the "high list" in return.

4. The remaining top half from one partner process and the received top half from the other partner process are merged into local sublist for each process.

5. Recurse the upper half and lower half of each subprocess to achieve a sorted list.

6. Finally merge the processes in order to get a fully sorted list.

Software implementation of the algorithm

Using n threads to sort nelements is not optimal. It is impossible to parallelize the process of dividing the array into subarrays under these conditions. In this case, it is best to use log(n) processes to sort n elements in O(n) time.

The program that implements the parallel sorting algorithm Hyper Quick Sort is implemented as a console application using the C# [4, 5].

The Parallel.Invoke() method allows you to run several subtasks in parallel, but since the work implements recursive HyperQuickSort, you need to limit the depth of parallelization with this method. The best option is to use an additional variable of type integer, which will be equal to the number of GPU cores.

Parallel.Invoke() is a namespace method in the System.Threading.Tasks .NET framework that allows a set of specified methods to be executed simultaneously. It takes an array of delegate methods as input and returns when all methods have completed execution (Fig. 3).

Fig. 3 Example of Parallel.Invoke() use

In the example in Figure 3, the methods Method1(), Method2(), and Method3() will be executed simultaneously. Parallel.Invoke() returns when all methods have finished executing.

It is important to note that Parallel.Invoke() does not guarantee the order in which the methods will be executed. If you want to ensure that methods are executed in a specific order, you should use other synchronization methods, such as Task.WaitAll() or Task.WhenAll().

The Parallel.Invoke method accepts an array of Action objects as a parameter, that is, it is possible to pass to this method a set of methods that will be called upon its execution, and the number of methods may be different. As with the Task class, it is possible to pass either a method name or a lambda expression. Fig. 4 shows the working diagram of the Parallel.Invoke method.

Fig. 4 Operation scheme of the Parallel.Invoke method

Therefore, if there are muliple cores on the target machine, these methods will be executed in parallel on different cores.

Fig. 5 shows an example of the parallel sorting algorithm Hyper Quick Sort implemented by C#.

Testing the Hyper Quick Sort algorithm

To test the parallel hyperquick sorting algorithm, a program was created that implements the parallel hyperquick sorting algorithm and related algorithms [1-3]: Sequential Quicksort; Naive Parallel Quicksort; Optimized Parallel Quicksort; Parallel Sequential Quicksort; Hyper QuickSort; Parallel Quicksort by Regular Sampling.

This program has the following features: generation of an array with random values according to the size specified by the user; sorting of a randomly generated array of data using one of the selected algorithms; measuring the time spent on sorting an array of N elements.

Table 1 provides information on the time to sort a randomly generated data array with the number of elements N = 1,000,000, elements of type integer in the range from 1 to 999.

Therefore, the analysis of the obtained data of table 1 confirms the good time performance of the parallel hyperquick sorting algorithm (the shortest time is 262 ms) in comparison with some existing sorting algorithms.

Conclusions. The investigated parallel hyperquick sort algorithm uses multiple nested elements to divide the list into more than two sublists, which allows the list to be sorted more efficiently, especially when the list is already partially sorted or when it has other special properties.

Fig. 5 Class HyperQuickSort

Sort Type

Time (ms)

Sequential Quicksort

1 278

Naive Parallel Quicksort

950

Optimized Parallel Quicksort

924

Parallel Sequential Quicksort

1 310

Hyper QuickSort

262

Parallel Quicksort by Regular Sampling

30 931

The algorithm is an improvement using a simple combination of serial and parallel approaches. Improves the chances of finding the true median by sequentially sorting the sublists using a single reference point that is broadcast to all processes at the beginning of the algorithm. There is communication overhead when values are passed between partner processes. Reduces the problem of load balancing. Load imbalance can still occur, but the algorithm is better compared to existing ones.

The flow graph and program code of the parallel hyperquick sorting algorithm have been developed.When processing a data array of dimension m, it calculates log (m) steps and n processes, the total time complexity is 0(n log(m)). The complexity of the space is O(log (m)). Testing confirmed the workability of the algorithm and good time characteristics. In future studies, it will be interesting to consider 3D quick sorting.

References

1. D. E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching (3rd. ed.), Addison Wesley Longman Publishing Co., Inc., 1998. 812 p.

2. S. V. Koliadenko, V. O. Denysiuk, N. P. Yurchuk. Dyskretnyi analiz. Chastyna1. Navchalnyi posibnyk. Vinnytsia: VNAU, 2019. 161s. [in Ukrainian].

3. Parallel Quick Sort. Retrieved from https://iq.opengenus.org/parallel-quicksort/

4. C# documentation. Retrieved from https://learn.microsoft.com/en-us/dotnet/csharp/

5. Task Class. Retrieved from https://leam.microsoft.com/en-s/dotnet/api/system.threading. tasks.task?view =net-6.0

Література

1. D. E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching (3rd. ed.), Addison Wesley Longman Publishing Co., Inc., 1998. 812 p.

2. С. В. Коляденко, В. О. Денисюк, Н. П. Юрчук. Дискретний аналіз. Частинаї. Навчальний посібник. Вінниця: ВНАУ, 2019. 161с.

3. Parallel Quick Sort. https://iq.opengenus.org/parallel-quicksort/

4. C# documentation. https://learn.microsoft.com/en-us/dotnet/csharp/

5. Task Class. https://learn.microsoft.com/en-s/dotnet/api/system.threading.tasks.task? view =net-6.0

Размещено на Allbest.ru

...

Подобные документы

  • Lists used by Algorithm No 2. Some examples of the performance of Algorithm No 2. Invention of the program of reading, development of efficient algorithm of the program. Application of the programs to any English texts. The actual users of the algorithm.

    курсовая работа [19,3 K], добавлен 13.01.2010

  • A database is a store where information is kept in an organized way. Data structures consist of pointers, strings, arrays, stacks, static and dynamic data structures. A list is a set of data items stored in some order. Methods of construction of a trees.

    топик [19,0 K], добавлен 29.06.2009

  • Проблемы оценки клиентской базы. Big Data, направления использования. Организация корпоративного хранилища данных. ER-модель для сайта оценки книг на РСУБД DB2. Облачные технологии, поддерживающие рост рынка Big Data в информационных технологиях.

    презентация [3,9 M], добавлен 17.02.2016

  • Data mining, developmental history of data mining and knowledge discovery. Technological elements and methods of data mining. Steps in knowledge discovery. Change and deviation detection. Related disciplines, information retrieval and text extraction.

    доклад [25,3 K], добавлен 16.06.2012

  • Классификация задач DataMining. Создание отчетов и итогов. Возможности Data Miner в Statistica. Задача классификации, кластеризации и регрессии. Средства анализа Statistica Data Miner. Суть задачи поиск ассоциативных правил. Анализ предикторов выживания.

    курсовая работа [3,2 M], добавлен 19.05.2011

  • Basic assumptions and some facts. Algorithm for automatic recognition of verbal and nominal word groups. Lists of markers used by Algorithm No 1. Text sample processed by the algorithm. Examples of hand checking of the performance of the algorithm.

    курсовая работа [22,8 K], добавлен 13.01.2010

  • Описание функциональных возможностей технологии Data Mining как процессов обнаружения неизвестных данных. Изучение систем вывода ассоциативных правил и механизмов нейросетевых алгоритмов. Описание алгоритмов кластеризации и сфер применения Data Mining.

    контрольная работа [208,4 K], добавлен 14.06.2013

  • Совершенствование технологий записи и хранения данных. Специфика современных требований к переработке информационных данных. Концепция шаблонов, отражающих фрагменты многоаспектных взаимоотношений в данных в основе современной технологии Data Mining.

    контрольная работа [565,6 K], добавлен 02.09.2010

  • Creation of the graphic program with Visual Basic and its common interface. The text of program code in programming of Visual Basic language creating in graphics editor. Creation of pictures in Visual Basic, some graphic actions with graphic editor.

    лабораторная работа [1,8 M], добавлен 06.07.2009

  • Review of development of cloud computing. Service models of cloud computing. Deployment models of cloud computing. Technology of virtualization. Algorithm of "Cloudy". Safety and labor protection. Justification of the cost-effectiveness of the project.

    дипломная работа [2,3 M], добавлен 13.05.2015

  • Основы для проведения кластеризации. Использование Data Mining как способа "обнаружения знаний в базах данных". Выбор алгоритмов кластеризации. Получение данных из хранилища базы данных дистанционного практикума. Кластеризация студентов и задач.

    курсовая работа [728,4 K], добавлен 10.07.2017

  • Історія виникнення комерційних додатків для комп'ютеризації повсякденних ділових операцій. Загальні відомості про сховища даних, їх основні характеристики. Класифікація сховищ інформації, компоненти їх архітектури, технології та засоби використання.

    реферат [373,9 K], добавлен 10.09.2014

  • Program of Audio recorder on visual basic. Text of source code for program functions. This code can be used as freeware. View of interface in action, starting position for play and recording files. Setting format in milliseconds and finding position.

    лабораторная работа [87,3 K], добавлен 05.07.2009

  • Description of a program for building routes through sidewalks in Moscow taking into account quality of the road surface. Guidelines of working with maps. Technical requirements for the program, user interface of master. Dispay rated pedestrian areas.

    реферат [3,5 M], добавлен 22.01.2016

  • Роль информации в мире. Теоретические основы анализа Big Data. Задачи, решаемые методами Data Mining. Выбор способа кластеризации и деления объектов на группы. Выявление однородных по местоположению точек. Построение магического квадранта провайдеров.

    дипломная работа [2,5 M], добавлен 01.07.2017

  • Theoretical aspects of the application digital education resources in teaching computer science according to the capabilities of electronic programs. Capabilities of tools Microsoft Office and Macromedia Flash. Application of the program Microsoft Excel.

    контрольная работа [1,5 M], добавлен 07.07.2013

  • Определение программы управления корпоративными данными, ее цели и предпосылки внедрения. Обеспечение качества данных. Использование аналитических инструментов на базе технологий Big Data и Smart Data. Фреймворк управления корпоративными данными.

    курсовая работа [913,0 K], добавлен 24.08.2017

  • Анализ проблем, возникающих при применении методов и алгоритмов кластеризации. Основные алгоритмы разбиения на кластеры. Программа RapidMiner как среда для машинного обучения и анализа данных. Оценка качества кластеризации с помощью методов Data Mining.

    курсовая работа [3,9 M], добавлен 22.10.2012

  • Методика и основные этапы построения модели бизнес-процессов верхнего уровня исследуемого предприятия, его организационной структуры, классификатора. Разработка модели бизнес-процесса в IDEF0 и в нотации процедуры, применением Erwin Data Modeler.

    курсовая работа [1,6 M], добавлен 01.12.2013

  • Program automatic system on visual basic for graiting 3D-Graphics. Text of source code for program functions. Setting the angle and draw the rotation. There are functions for choose the color, finds the normal of each plane, draw lines and other.

    лабораторная работа [352,4 K], добавлен 05.07.2009

Работы в архивах красиво оформлены согласно требованиям ВУЗов и содержат рисунки, диаграммы, формулы и т.д.
PPT, PPTX и PDF-файлы представлены только в архивах.
Рекомендуем скачать работу.