900字范文,内容丰富有趣,生活中的好帮手!
900字范文 > C++ 高性能计算之多线程简单基础入门教程

C++ 高性能计算之多线程简单基础入门教程

时间:2021-04-04 18:31:09

相关推荐

C++ 高性能计算之多线程简单基础入门教程

C/C++ 高性能计算之多线程简单基础入门教程

比起别人的盲目罗列函数接口,鹦鹉学舌式的解释每一个输入参数和输出参数,一味求全而无重点,我的文章更侧重于入门知识的讲解,宁缺毋滥,只有一些最简单的入门用法,先亮代码,让你以最快地速度感受到这个东西原来是这个样子。更深入的知识,可以查看文档以及其他参考。

不管什么东西,都是尽可能地用新不用旧,所以,文章涉及的库和工具,都是基于 C++11 标准的。

前面有相关的博文已经介绍过 MPI,OpenMP 等相关的并行技术,以及 hadoop 生态圈相关的一些知识。今天给大家介绍一下 C/C++(下面统称为 C)中的利用多线程并行或者说并发进行计算的一个简单基础入门教程。

我这篇教程适合的人群为那些没接触过 C 的多线程编程,想要入门的同学,以及那些有程序基础,单纯的想要给自己的程序譬如循环部分,加加速,想要拷贝一份模板进行修改的人。

让我们开始吧。

从一个极简的例子开始

假设你现在写了一段 C 代码,大概长成下面这样。

using namespace std;void fun(para_ref,para){for (int j = 0; j < task_size; j++) {dosomething}}int main(int argc, char *argv[]){fun(para_ref,para);return 0;}

你哇的一下突然发现,你调用的这个函数里面有个 for 循环,这个 for 循环的每一次 j,似乎有没什么关系,各算各的,你也知道自己的电脑有四个核心,总不能浪费资源吧,你开始谋划并行的策略。你老板说不能用 OpenMp 的编译选项,MPI 的编程门槛有很高,然后你想到了似乎在哪里听到过的多线程。

一个最容易想到的就是把 for 循环给均分拆成几个部分,每个线程执行 for 中的一个部分。怎么做呢?非常简单,对于这种无需异步的需求,你像下面这样写就 OK 了。

#include <thread>//使用 c++11 标准的多线程库 thread#define THREAD_NUM 4//定义多线程的个数using namespace std;void fun(para_ref, para, int index_begin, int task_num)//原来的函数,被拆成了一片片,需要添加循环起始索引,和个数{for (int j = index_begin; j < index_begin + task_num; j++){dosomething;}}int main(int argc, char *argv[]){thread threads[THREAD_NUM];//定义4个线程int task_num = (int)(task_size / THREAD_NUM);//计算每个线程负责的任务数for (int i = 0; i < THREAD_NUM - 1; i++){threads[i] = thread(fun, ref(para_ref), para, i*task_num, task_num);//每个线程开始做各自的任务}//因为 task_size 可能不能被平均分成 4 分,最后一份要特别处理threads[THREAD_NUM - 1] = thread(func, ref(para_ref), para, (THREAD_NUM - 1)*task_num, int(task_num + (task_size% THREAD_NUM)));for (auto& it : threads){it.join();//4个线程同步}return 0;}

代码中有较为详尽的解释,仔细看一遍,就知道 C++ 多线程是怎么用了。你可以把它当成也给模板,拷贝到自己的代码中,随便改改就能用了。简单解释一下

threads[i] = thread(fun, ref(para_ref), para, i*task_num, task_num);

这里的 thread 表示开始多线程的调用,其中 fun 是你每个线程要调用的回调函数,回调函数要求必须是静态函数,或者说是全局函数,如果你写在 C++ 的某个 Class 里面,可能需要处理的问题就比较多了,因为静态成员函数只能类的静态成员,牵一发而动全身。

fun 后面紧跟的是 fun 的输入参数,这个没什么好说的,需要注意的是,如果 fun 的输入参数有引用的类型的(C++ 中用 & 符号标识),那么多线程调用就要用 ref 关键字说清楚。如上所示。

任务池的方式实现并行的负载均衡(无需等待的 mutex 互斥锁设计)

上面提到的方法是利用循环变量的均匀划分来实现任务的划分,有个问题在于,这样划分真的是负载均衡的吗?有没有可能三个线程秒算完了,一个线程在还在一直算?这样是不是没有拉满 CPU,并不能达到最大性能?

举例来说,循环前 1/4 的占了 99% 的时间,那么这种多线程的任务分配方式就不尽合理。有人提出除 4 取余,余数相同的用同一个线程来做的方法,能做到负载均衡吗?也是看情况而定,假如刚好某个循环变量点的计算占了大部分时间,那么,4 个线程几乎同时算完的概率就很低。

那要怎么搞?一个基本的想法就是维护一个任务池。就像工厂里面的多人同时作业一样。把所有的任务丢到一个筐里面,每个工人都去筐那里认领任务,做完了手头闲了,接着去领任务,这样能够保证没有空闲劳动力,是最好的剥削工人方式。

这个思路听着好像很有道理,但是其实面临着一个非常严峻的的问题——任务争抢和读写脏数据。什么意思呢?我还是用工人的例子来做个假设。

假设有甲乙丙丁四个工人去图书馆拿书,书都是按 1、2、3、4、5……这种顺序编好号并分区放好按号来取的。那么,这时候最好需要在图书馆入口处放一本本子,来登记工人们取的书的编号,以免不知道取没取,就搞乱了。登记的思路是,比如甲乙丙丁各自拿了1、2、3、4四本书并登记好出去了,乙手脚麻利,很快回来,然后翻看一下本子,发现1、2、3、4都被取走了,这时候我取 5 吧,做了登记,就去图书馆领书了……以此类推。这时候,如果甲乙丙干完一个任务,同时回到了图书馆做登记,就一个本子啊,为了多干点活多挣点钱,他们就会开始抢这个本子,一会儿,他们就打起来了,这样就出问题了……怎么办,一个解决的方法,就是让他们排队登记,谁先踏入图书馆的门,就排前面,正在登记的人的本子“锁起来”,不让后面的人碰他,直到他写完,把锁打开。这个锁在 C++ 中有很多类型,比如说互斥锁(mutex),所谓的互斥,就是我在用,排斥你,你在用,就排斥我。学过数据库的就很清楚。

用互斥锁的思路是可以实现的。但是,新问题来了,让工人们排队等,岂不是白白浪费时间,本来一天可以拿 100 本书,现在排队浪费了时间,一天只能拿 90 本书,血亏啊。一般,对于 C++ 程序,如果对性能有很高的要求的话,多线程加锁血崩。那怎么办呢?别急,我有一个好办法。

就是把所有的任务编号全部写到本子上,工人取走一个任务,划掉一个任务。比如说你干完一个任务 5 回到图书馆了,然后看本子上的任务 6 被划掉了,接着看任务 7 也被划掉了,直到任务 8,还没被划掉,那么你就可以认领并划掉任务 8。这里的划掉这个动作,也可以用加锁来实现。这为什么能解决问题呢?譬如说两个工人同时回到了图书馆,然后,甲看到乙正在划任务 10(被上锁了),他不需要等你划完,他可以接着往下找,找到一个没被划掉的,然后取划掉它。这个过程,就不需要等待。不争,才是最好的状态,对于人生来说,又何尝不是如此。

怎么实现呢?同样,我提供一个写法的模板,看代码,永远是最好的学习程序的方式。

同样的,假如你现在手头有这样一份代码

#include<thread>using namespace std;void run_dfs(){for (int i = 0; i < task_num; ++i){xxxxxxxxxxi}}int main(){run_dfs();exit(0);}

你希望对 run_dfs 里面的 for 循环进行任务池式的并行。那么,你就可以这样子写:

#include<thread>#include<vector>using namespace std;vector<mutex> mtx(task_num);void dfs_thread(int thread_id){int i = thread_id;while (i<task_num){if (mtx[i].try_lock()){xxxxxxxxxxi//do_task(i);}i++;}}void run_dfs(){thread threads[4];for (int thread_id = 0; thread_id < 4; thread_id++){threads[thread_id] = thread(dfs_thread, thread_id);}for (auto& it : threads){it.join();}}int main(){run_dfs();exit(0);}

是不是很简洁?用了锁,却不需要等待。当然,这里的 while if 也是可以改成 for 的。解释一下,这个模板里面没有什么新的东西。一个是声明了一个互斥锁向量 mtx,用来记录某任务是否被上锁(即是否被工人划掉了)。这里只有用到一个方法就是互斥锁的try_lock(),上锁成功返回 true,上锁失败返回 false,如果变量地址已经被锁住了,就会上锁失败,这个判断操作是很省时间的。

如果你的任务开始时有序的,做完任务之后还要求是有序的,即原来是按 1、2 、3……拍的任务,工人做好作品之后,需要按原来的顺序排好,那么在程序中就需要想办法记录每个工人认领了哪些任务号,或者说,每个任务依次是被那个任务认领的。

其他

多线程只要不写同一个内存地址,是不会引起冲突的,比如说四个线程分别写数组int a[4]的四个位置。如果你问我你的程序计算结果相互依赖,依赖关系还很复杂,要怎么设计程序才能最大化 CPU 利用,同步的 wait 和异步的 feature ,了解一下。线程跑完 join 和 detach 怎么理解?顾名思义呗。join 就像一条河流,走到了某个地方,在等待其他河流的加入,如果其他河流不加入(即没运行到这个位置),傲娇的河流就不往前流了。 detach 就是分离,单飞啦,不等你们,就像曹云金离开了郭德纲, detach 了,即使主线程德云社倒闭了,我照样无休止地飞奔下去,一直到东海。

最后,再附几个例子程序吧。

1、

#include <iostream>#include <vector>#include <sstream>#include <fstream>#include <cmath>#include <cstdlib>#include <numeric>#include <string.h>#include <thread>#include <unistd.h>#include <sys/ipc.h>#include <sys/shm.h>#include <sys/wait.h>//用于进程的监听#include <ctime> //要包含的头文件#define TEST#define FEATURE_NUM 1000#define THREAD_NUM 4#define TRAIN_DATA_NUM 8000#define TEST_DATA_NUM 20000using namespace std;//一些全局变量的定义double features[TRAIN_DATA_NUM][FEATURE_NUM];double features_test[TEST_DATA_NUM][FEATURE_NUM];const int MAXS = 8 * (FEATURE_NUM + 1)*TRAIN_DATA_NUM;char buf[MAXS];const int MAXS_TEST = 8 * FEATURE_NUM*TEST_DATA_NUM;char buf_test[MAXS_TEST];int main(int argc, char *argv[]){//******************************函数申明**************************************void loadTrainData2(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]);void loadTrainData(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]);void initParam(double wtInitV, double(*wtSet)[FEATURE_NUM]);void train(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&wtSet)[FEATURE_NUM],double stepSize0, double rho, double c, double epsilon);void loadTestData2(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]);void loadTestData(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]);void predict(double(&features_test)[TEST_DATA_NUM][FEATURE_NUM], double(&wtSet)[FEATURE_NUM], string predictOutFile);void loadAnswerData(string awFile, int(&awVec)[TEST_DATA_NUM]);#ifdef TESTdouble ts = clock();#endif//***************给定文件路径****************/* string trainFile = "./data/train_data.txt";string testFile = "./data/test_data2.txt";string predictFile = "./projects/student/result.txt";string answerFile = "./projects/student/answer2.txt"; */string trainFile = "/data/train_data.txt";string testFile = "/data/test_data2.txt";string predictFile = "/projects/student/result.txt";string answerFile = "/projects/student/answer2.txt";//********************程序的多进程处理*********************//创建共享变量和内存块,创建了 ID 为WT_SET的共享内存块,用来放权重int shm;double* wtSet_p;double wtSet[FEATURE_NUM];shm = shmget(IPC_PRIVATE, FEATURE_NUM*sizeof(double), IPC_CREAT | 0644);if (shm < 0){perror("shmget");return 1;}//创建子进程pid_t pid;pid = fork();if (pid == -1){perror("fork");exit(1);}else if (pid == 0) //子进程用来读训练数据和训练{//cout << "I'm parent, pid = " << getpid() << endl;//******************给定训练参数***********const double stepSize0 = 0.5;//初始步长const double wtInitV = 0;//初始权重const int maxIterTimes = 9;//最大迭代步数,不一定用到double rho = 0.5;double c = 0.01;double epsilon = 0.01;//*****************读入训练数据***************cout << "loading train data ..." << endl;#ifdef TESTdouble ts11 = clock();#endifint labels[TRAIN_DATA_NUM];loadTrainData2(trainFile, features, labels);//cout << "size of train data:" << (*trainDataSet_p).size() << endl;//cout << "label of train data:" << (*trainDataSet_p)[10].label << endl;//cout << "feature of train data:" << (*trainDataSet_p)[10].features[10] << endl;//getchar();#ifdef TESTdouble te11 = clock();cout << "loading train data time is " << (double)(te11 - ts11) / CLOCKS_PER_SEC << "s" << endl;#endif//*********************开始训练************************//初始化权重,并开始训练cout << "training ..." << endl;#ifdef TESTdouble ts12 = clock();#endifinitParam(wtInitV, &wtSet);//初始化参数train(features, labels, wtSet, stepSize0, rho, c, epsilon);//存模型//storeModel(*wtSet_p);#ifdef TESTdouble te12 = clock();cout << "train time is " << (double)(te12 - ts12) / CLOCKS_PER_SEC << "s" << endl;#endif//取共享内存空间,写入入共享数据权重wtSet_p = (double*)shmat(shm, NULL, 0);for (int i = 0; i < FEATURE_NUM; i++){*(wtSet_p + i) = wtSet[i];/* cout << wtSet[i]<<endl;getchar(); */}shmdt(wtSet_p);//lost connect#ifdef TESTdouble tes = clock();cout << "son process total time is******************** " << (double)(tes - ts) / CLOCKS_PER_SEC << "s" << endl;#endif}else if (pid > 0) //父进程用来读取测试数据{//cout << "I'm child, pid = " << getpid() << endl;cout << "loading test data ..." << endl;#ifdef TESTdouble ts21 = clock();#endifloadTestData2(testFile, features_test);//载入测试数据#ifdef TESTdouble te21 = clock();cout << "loading test data time is " << (double)(te21 - ts21) / CLOCKS_PER_SEC << "s" << endl;#endif//预测,并保存数据cout << "predicting ..." << endl;wtSet_p = (double*)shmat(shm, NULL, 0);/* if(int(wtSet_p) == -1){perror("shmat");return -1;} */#ifdef TESTdouble tswait = clock();#endifint status;wait(&status);//等待子进程 训练结束for (int i = 0; i < FEATURE_NUM; i++){wtSet[i] = *(wtSet_p + i);//写入数据/* cout << wtSet[i]<<endl;getchar(); */}#ifdef TESTdouble tewait = clock();cout << "waiting time is " << (double)(tewait - tswait) / CLOCKS_PER_SEC << "s" << endl;#endif#ifdef TESTdouble ts22 = clock();#endifpredict(features_test, wtSet, predictFile);#ifdef TESTdouble te22 = clock();cout << "predicting time is " << (double)(te22 - ts22) / CLOCKS_PER_SEC << "s" << endl;#endif#ifdef TESTint answerVec[TEST_DATA_NUM];//载入答案和保存预测值,计算误差率cout << "calculating accuracy ..." << endl;//cout << "loading answer data ..." << endl;loadAnswerData(answerFile, answerVec);//载入标准答案//cout << "after loading answer data ..." << endl;int predictVec[TEST_DATA_NUM];//cout << "loading predict data ..." << endl;loadAnswerData(predictFile, predictVec);int answerVec_size = sizeof(answerVec) / sizeof(int);//std::cout << predictVec.size() << " " << answerVec.size() << endl;int correctCount = 0;//cout << "begin calculating accuracy ..." << endl;for (int j = 0; j < TEST_DATA_NUM; j++){if (j < answerVec_size){if (answerVec[j] == predictVec[j]){correctCount++;}}else{cout << "answer size less than the real predicted value" << endl;}}double accurate = ((double)correctCount) / TEST_DATA_NUM;//cout << "(double)correctCount: " << (double)correctCount << endl;//cout << "(*answerVec_p).size(): " << answerVec.size() << endl;cout << "the prediction accuracy is ******************** " << accurate << endl;double tep = clock();cout << "parent process total time is ******************** " << (double)(tep - ts) / CLOCKS_PER_SEC << "s" << endl;#endif}return 0;}inline double Dot(double(&x)[FEATURE_NUM], double(&y)[FEATURE_NUM]){double temp = 0;int i;for (i = 0; i < FEATURE_NUM; i++){temp = temp + x[i] * y[i];}return temp;}void initParam(double wtInitV, double(*wtSet)[FEATURE_NUM]){int i;for (i = 0; i < FEATURE_NUM; i++){(*wtSet)[i] = wtInitV;}}void loadTrainData(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]){#ifdef TESTdouble ts111 = clock();#endifFILE *fp;//int tick_end = TRAIN_DATA_NUM*(FEATURE_NUM + 1);fp = fopen(trainFile.c_str(), "r");if (fp == NULL){printf("file error\n");exit(1);}fseek(fp, 0L, SEEK_END);int len = ftell(fp);//自动判断 fread 要读入的量rewind(fp);//算完要读入的量后重置指针const int maxS = len;//先读个 60 Mchar* buf = new char[maxS];len = fread(buf, 1, maxS, fp);//因为ftell回车算的是两个字符,所以这里要重置一下lenfclose(fp);#ifdef TESTdouble te111 = clock();cout << "fread train data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;double ts112 = clock();#endiffor (int i = 0; i <= len; i++){//把换行符号替换掉if (buf[i] == '\n'){buf[i] = ',';}}const char * split = ",";char *p = strtok(buf, split);double temp;int row_ind = 0;int col_ind = 0;int tick = 0;//double ftf;while (p != NULL&&row_ind < TRAIN_DATA_NUM){tick = tick + 1;sscanf(p, "%lf", &temp);if (tick % (FEATURE_NUM + 1) == 0){labels[row_ind] = int(temp);row_ind = row_ind + 1;col_ind = 0;p = strtok(NULL, split);continue;}features[row_ind][col_ind] = temp;col_ind = col_ind + 1;p = strtok(NULL, split);}#ifdef TESTdouble te112 = clock();cout << "split train data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;#endif}//char* buf = new char[MAXS];void loadTrainData2(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]){#ifdef TESTdouble ts111 = clock();#endifFILE *fp;fp = fopen(trainFile.c_str(), "rb");int len = fread(buf, 1, MAXS, fp);fclose(fp);#ifdef TESTdouble te111 = clock();cout << "fread train data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;double ts112 = clock();#endifbuf[len] = '\0';double position = 1.0;int symbol = 0;int point = 0;int i, col, row = 0;double line[FEATURE_NUM + 2] = { 0 };i = 0;for (char *p = buf; *p && p - buf < len; p++){if (row >= TRAIN_DATA_NUM){break;}if (*p == ','){if (symbol == 1)line[i] = -line[i];i++;point = 0;position = 1.0;symbol = 0;line[i] = 0.0;}else if (*p == '\n'){if (symbol == 1)line[i] = -line[i];int ftf = (int)line[FEATURE_NUM];for (col = 0; col < FEATURE_NUM; col++){features[row][col] = line[col];}labels[row] = ftf;row = row + 1;i = 0;point = 0;position = 1.0;symbol = 0;line[i] = 0.0;}else if (*p == '.'){point = 1;}else if (*p == '-'){symbol = 1;}else{if (point == 0)line[i] = line[i] * 10 + *p - '0';else{position = position*0.1;line[i] += position*(*p - '0');}}}#ifdef TESTdouble te112 = clock();cout << "split train data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;#endif}void wxbVecCalc(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], double(&w)[FEATURE_NUM], double wxbVec[TRAIN_DATA_NUM]){//just matrix multiplicationdouble temp;int i, j;for (i = 0; i < TRAIN_DATA_NUM; i++){temp = 0;for (j = 0; j < FEATURE_NUM; j++){temp = temp + features[i][j] * w[j];}wxbVec[i] = temp;}}void wxbVecCalc_test(double(&features)[TEST_DATA_NUM][FEATURE_NUM], double(&w)[FEATURE_NUM], double wxbVec[TEST_DATA_NUM]){//just matrix multiplicationdouble temp;int i, j;for (i = 0; i < TEST_DATA_NUM; i++){temp = 0;for (j = 0; j < FEATURE_NUM; j++){temp = temp + features[i][j] * w[j];}wxbVec[i] = temp;}}inline double sigmoidCalc(const double wxb){double expv = exp(-1 * wxb);double expvInv = 1 / (1 + expv);return expvInv;}void sigmoidVecCalc(double(&wxbVec)[TRAIN_DATA_NUM], double(&sigmoidVec)[TRAIN_DATA_NUM]){int i;for (i = 0; i < TRAIN_DATA_NUM; i++){sigmoidVec[i] = sigmoidCalc(wxbVec[i]);}}void sigmoidVecCalc_test(double(&wxbVec)[TEST_DATA_NUM], double(&sigmoidVec)[TEST_DATA_NUM]){int i;for (i = 0; i < TEST_DATA_NUM; i++){sigmoidVec[i] = sigmoidCalc(wxbVec[i]);}}void wrxbmVecCalc(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], double(&labSigmoidVec)[TRAIN_DATA_NUM], double wrxbmVec[TRAIN_DATA_NUM]){//just reverse matrix multiplicationdouble temp=0.0L;int i, j;for (i = 0; i < FEATURE_NUM; i++){temp = 0.0;for (j = 0; j < TRAIN_DATA_NUM; j++){temp = temp + features[j][i] * labSigmoidVec[j];}wrxbmVec[i] = temp/TRAIN_DATA_NUM;}}void gradientVec(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&sigmoidVec)[TRAIN_DATA_NUM], double(&gVec)[FEATURE_NUM]){double labSigmoidVec[TRAIN_DATA_NUM];for (int j = 0; j < TRAIN_DATA_NUM; j++){labSigmoidVec[j] = double(labels[j])-sigmoidVec[j];}wrxbmVecCalc(features, labSigmoidVec, gVec);}double gradientSlope(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], int index, double(&sigmoidVec)[TRAIN_DATA_NUM]){//can still be modifieddouble gsV = 0.0L;int i;double sigv, label;for (i = 0; i < TRAIN_DATA_NUM; i++){sigv = sigmoidVec[i];label = labels[i];gsV += (label - sigv) * (features[i][index]);}// cout<<gsV<<endl;// getchar();gsV = gsV / TRAIN_DATA_NUM;return gsV;}double lossCal(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&w)[FEATURE_NUM]){double lossV = 0.0L;int i;double wxbVec[TRAIN_DATA_NUM];double sigmoidVec[TRAIN_DATA_NUM];wxbVecCalc(features, w, wxbVec);sigmoidVecCalc(wxbVec, sigmoidVec);for (i = 0; i < TRAIN_DATA_NUM; i++){lossV -= labels[i] * log(sigmoidVec[i]);lossV -= (1 - labels[i]) * log(1 - sigmoidVec[i]);}lossV /= TRAIN_DATA_NUM;return lossV;}void train(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&wtSet)[FEATURE_NUM],double stepSize0, double rho, double c, double epsilon){double dfk0_norm = 0;double dfk1_norm = 0;double beta;double stepSize;double f1, f2;int i, j;double p[FEATURE_NUM];double dfk0[FEATURE_NUM];double dfk1[FEATURE_NUM];double sigmoidVec[TRAIN_DATA_NUM];double wTry[FEATURE_NUM];double wxbVec[TRAIN_DATA_NUM];//计算 Sigmoid(Aw)wxbVecCalc(features, wtSet, wxbVec);sigmoidVecCalc(wxbVec, sigmoidVec);//cout << "debug 1" << sigmoidVec[10] << endl;//计算 p 和 grad f的normgradientVec(features, labels, sigmoidVec, dfk0);dfk0_norm = Dot(dfk0, dfk0);for (i = 0; i < FEATURE_NUM; i++){p[i] = dfk0[i];}//cout << "debug 2" << p[10] << "dfk0_norm" << dfk0_norm << endl;#ifdef TESTint iter_num = 0;#endif//开始进入算法的循环//for (i = 0; i < maxIterTimes; i++) {while (dfk0_norm > epsilon){#ifdef TESTiter_num = iter_num + 1;#endif//cout << "dfk0_norm:"<<dfk0_norm << endl;stepSize = stepSize0;//用回退方法计算步长///*double lossV = 0.0L;for (i = 0; i < TRAIN_DATA_NUM; i++){lossV -= labels[i] * log(sigmoidVec[i]);lossV -= (1 - labels[i]) * log(1 - sigmoidVec[i]);}lossV /= TRAIN_DATA_NUM;f2 = lossV - c*stepSize*Dot(dfk0, p);//while (true)for (;;){//cout << "stepsize%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%:" << stepSize << endl;//cout<<"stepsize:"<<stepSize<<endl;//计算 w+alpha*p_k//cout<<"stepsize:"<<stepSize<<endl;for (j = 0; j < FEATURE_NUM; j++){wTry[j] = wtSet[j] + stepSize * p[j];}//cout << "debug 3" << wTry[10] << endl;//cout << "debug LL" << wtSet[10] << " " << stepSize << " " << p[10] << endl;f1 = lossCal(features, labels, wTry);//cout << "f1:"<<f1<<endl<<"f2:"<< f2 << endl;//cout<<"debug L"<<lossCal()<<" "<<c<<" "<< stepSize<<" " <<dfk0[10]<<" "<<p[10]<<endl;//cout << "debug 4" << "f1:" << f1 << "f2:" << f2 << endl;if (f1 <= f2 || stepSize < 0.001){break;}stepSize = rho*stepSize;//cout << "stepSize" << stepSize << endl;}//*///stepSize = rho*stepSize0;//先走一步共轭方向步for (j = 0; j < FEATURE_NUM; j++){wtSet[j] += stepSize * p[j];}//cout << "debug 5" << .wtSet[10] << endl;//更新 beta 和 p//算下一步梯度以及梯度模//计算 Sigmoid(Aw)wxbVecCalc(features, wtSet, wxbVec);sigmoidVecCalc(wxbVec, sigmoidVec);//cout<<"param.wtSet"<<param.wtSet[3]<<endl;gradientVec(features, labels, sigmoidVec, dfk1);dfk1_norm = Dot(dfk1, dfk1);//cout << "debug 6:" << endl << sigmoidVec[10] << dfk1[10] << dfk1_norm << endl;//cout<<"dfk1_norm"<<dfk1_norm<<endl;//算 betabeta = dfk1_norm / dfk0_norm;//cout << beta << endl;//cout << "debug 7:" << beta << endl;//cout<<"p"<<p[10]<<endl;//更新一下 pfor (j = 0; j < FEATURE_NUM; j++){p[j] = dfk1[j] + beta*p[j];}//cout << "debug 8:" << p[10] << endl;dfk0_norm = dfk1_norm;for (i = 0; i < FEATURE_NUM; i++){dfk0[i] = dfk1[i];}//cout << "dfk1_norm:"<<dfk1_norm << endl;}#ifdef TESTcout << "train total iter num: " << iter_num << endl;cout << "dfk0_norm: " << dfk0_norm << endl;#endif}void loadAnswerData(string awFile, int(&awVec)[TEST_DATA_NUM]){//vector<int> awVec;int k = 0;ifstream infile(awFile.c_str());if (!infile){cout << "打开答案文件失败" << endl;exit(0);}while (infile){string line;int aw;getline(infile, line);if (line.size() > 0 && k <= TEST_DATA_NUM){stringstream sin(line);sin >> aw;awVec[k] = aw;k = k + 1;}}infile.close();}void loadTestData(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]){FILE *fp;fp = fopen(testFile.c_str(), "r");if (fp == NULL){printf("file error\n");exit(1);}fseek(fp, 0L, SEEK_END);int len = ftell(fp);//自动判断 fread 要读入的量rewind(fp);//算完要读入的量后重置指针const int maxS = len;//先读个 60 Mchar* buf = new char[maxS];len = fread(buf, 1, maxS, fp);//因为ftell回车算的是两个字符,所以这里要重置一下lenfclose(fp);for (int i = 0; i <= len; i++){//把换行符号替换掉if (buf[i] == '\n'){buf[i] = ',';}}const char * split = ",";char *p = strtok(buf, split);double temp;int row_ind = 0;int col_ind = 0;int tick = 0;//int ftf;//double ftf;while (p != NULL&&row_ind < TEST_DATA_NUM){tick = tick + 1;sscanf(p, "%lf", &temp);features[row_ind][col_ind] = temp;col_ind = col_ind + 1;if (tick % (FEATURE_NUM) == 0){row_ind = row_ind + 1;col_ind = 0;}p = strtok(NULL, split);}}void loadTestData2(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]){#ifdef TESTdouble ts111 = clock();#endifFILE *fp;fp = fopen(testFile.c_str(), "rb");int len = fread(buf_test, 1, MAXS_TEST, fp);#ifdef TESTdouble te111 = clock();cout << "fread test data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;double ts112 = clock();#endifbuf_test[len] = '\0';double position = 1.0;int symbol = 0;int point = 0;int i, col, row = 0;double line[FEATURE_NUM + 1] = { 0.0 };i = 0;for (char *p = buf_test; *p && p - buf_test < len; p++){if (row >= TEST_DATA_NUM){break;}if (*p == ','){if (symbol == 1)line[i] = -line[i];i++;point = 0;position = 1.0;symbol = 0;line[i] = 0.0;}else if (*p == '\n'){if (symbol == 1)line[i] = -line[i];for (col = 0; col < FEATURE_NUM; col++){features[row][col] = line[col];}row = row + 1;i = 0;point = 0;position = 1.0;symbol = 0;line[i] = 0.0;}else if (*p == '.'){point = 1;}else if (*p == '-'){symbol = 1;}else{if (point == 0)line[i] = line[i] * 10 + *p - '0';else{position = position*0.1;line[i] += position*(*p - '0');}}}fclose(fp);#ifdef TESTdouble te112 = clock();cout << "split test data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;#endif}int storePredict(int(&predict)[TEST_DATA_NUM], string predictOutFile){string line;int i;ofstream fout(predictOutFile.c_str());if (!fout.is_open()){cout << "打开预测结果文件失败" << endl;}for (i = 0; i < TEST_DATA_NUM; i++){fout << predict[i] << endl;}fout.close();return 0;}void predict(double(&features_test)[TEST_DATA_NUM][FEATURE_NUM], double(&wtSet)[FEATURE_NUM], string predictOutFile){double sigVal;int predictVal;int predictVec[TEST_DATA_NUM];double predictTrueThresh = 0.5;double wxbVec[TEST_DATA_NUM];double sVec[TEST_DATA_NUM];wxbVecCalc_test(features_test, wtSet, wxbVec);sigmoidVecCalc_test(wxbVec, sVec);for (int j = 0; j < TEST_DATA_NUM; j++){sigVal = sVec[j];predictVal = sigVal >= predictTrueThresh ? 1 : 0;predictVec[j] = predictVal;}storePredict(predictVec, predictOutFile);}

2、

#include <map>#include <list>#include <set>#include <stack>#include <unordered_map>#include <algorithm>#include <iostream>#include <stdio.h>#include <chrono>#include <vector>#include <array>#include <tuple>#include <chrono>#include <string.h>#include <thread>using namespace std;typedef vector<vector<int>> Graph;const int maxRecN = 280000;const int maxCycleN = 3000000;const int MAXS_R = maxCycleN * 80;char a[MAXS_R];char L3[MAXS_R];char L4[MAXS_R];char L5[MAXS_R];char L6[MAXS_R];char L7[MAXS_R];int ID[maxRecN][2];const int MAXS = maxRecN * 30;char buf[MAXS];int readTestData(string testDataPath, Graph &gTestData, vector<int> &idxMap, \Graph &pathL2SmHead, \Graph &pathL2SmTail){int i = 0, commaN = 0;FILE *fp;fp = fopen(testDataPath.c_str(), "rb");int len = fread(buf, 1, MAXS, fp);buf[len] = '\0';ID[0][0] = 0;ID[0][1] = 0;set<int> vertexSet;unordered_map<int, int> idxMapRev;for (char *p = buf; *p&&p - buf < len; p++){if (*p == '\n'){++i;ID[i][0] = 0;ID[i][1] = 0;commaN = 0;}else if (*p == ','){vertexSet.insert(ID[i][commaN]);commaN++;}else if (commaN < 2){ID[i][commaN] = ID[i][commaN] * 10 + *p - '0';}}int index = 0;for (auto &it : vertexSet){idxMap.emplace_back(it);idxMapRev.insert(make_pair(it, index));++index;}int from, to;gTestData.resize(index, vector<int>());pathL2SmHead.resize(index, vector<int>());pathL2SmTail.resize(index, vector<int>());for (int j = 0; j < i; ++j){from = idxMapRev[ID[j][0]];to = idxMapRev[ID[j][1]];if (from != to)//no loop,some off line data have loop{gTestData[from].emplace_back(to);if (from < to)pathL2SmHead[from].emplace_back(to);elsepathL2SmTail[from].emplace_back(to);}}for (auto &it : gTestData)sort(it.begin(), it.end());//sort for arranged output resultfor (auto &it : pathL2SmHead)sort(it.begin(), it.end());//sort for arranged output resultfor (auto &it : pathL2SmTail)sort(it.begin(), it.end());//sort for arranged output resultreturn i;}int getDigits(int v) {if (v < 10) return 1;if (v < 100) return 2;if (v < 1000) return 3;if (v < 1000000000000) { // 10^12if (v < 100000000) { // 10^7if (v < 1000000) { // 10^6if (v < 10000) return 4;return 5 + (v >= 100000); // 10^5}return 7 + (v >= 10000000); // 10^7}if (v < 10000000000) { // 10^10return 9 + (v >= 1000000000); // 10^9}return 11 + (v >= 100000000000); // 10^11}return 12 + getDigits(v / 1000000000000); // 10^12}int intToAscii(int value, char* dst) {static const char digits[] = {"0001020304050607080910111213141516171819""222324252627282930313233343536373839""4041424344454647484950515253545556575859""6061626364656667686970717273747576777879""8081828384858687888990919293949596979899" };const int length = getDigits(value);int next = length - 1;while (value >= 100) {const int i = (value % 100) * 2;value /= 100;dst[next - 1] = digits[i];dst[next] = digits[i + 1];next -= 2;}// Handle last 1-2 digitsif (value < 10) {dst[next] = '0' + int(value);}else {int i = int(value) * 2;dst[next - 1] = digits[i];dst[next] = digits[i + 1];}return length;}void arr2Ascii3(array<int, 3> &arr, char* &p, vector<int> &idxMap){int tmp;int dig;dig = intToAscii(idxMap[arr[0]], p);p += dig;for (int i = 1; i < 3; ++i){*p++ = ',';dig = intToAscii(idxMap[arr[i]], p);p += dig;}*p++ = '\n';}void arr2Ascii4(array<int, 4> &arr, char* &p, vector<int> &idxMap){int tmp;int dig;dig = intToAscii(idxMap[arr[0]], p);p += dig;for (int i = 1; i < 4; ++i){*p++ = ',';dig = intToAscii(idxMap[arr[i]], p);p += dig;}*p++ = '\n';}void arr2Ascii5(array<int, 5> &arr, char* &p, vector<int> &idxMap){int tmp;int dig;dig = intToAscii(idxMap[arr[0]], p);p += dig;for (int i = 1; i < 5; ++i){*p++ = ',';dig = intToAscii(idxMap[arr[i]], p);p += dig;}*p++ = '\n';}void arr2Ascii6(array<int, 6> &arr, char* &p, vector<int> &idxMap){int tmp;int dig;dig = intToAscii(idxMap[arr[0]], p);p += dig;for (int i = 1; i < 6; ++i){*p++ = ',';dig = intToAscii(idxMap[arr[i]], p);p += dig;}*p++ = '\n';}void arr2Ascii7(array<int, 7> &arr, char* &p, vector<int> &idxMap){int tmp;int dig;dig = intToAscii(idxMap[arr[0]], p);p += dig;for (int i = 1; i < 7; ++i){*p++ = ',';dig = intToAscii(idxMap[arr[i]], p);p += dig;}*p++ = '\n';}void writeResultData(string resultDataPath, char* &p3, char* &p4, char* &p5, char* p6, char* p7, int cycleCount){int sizeL3 = p3 - L3;int sizeL4 = p4 - L4;int sizeL5 = p5 - L5;int sizeL6 = p6 - L6;int sizeL7 = p7 - L7;int totalSize = sizeL3 + sizeL4 + sizeL5 + sizeL6 + sizeL7;memcpy(p3, L4, sizeL4);p3 += sizeL4;memcpy(p3, L5, sizeL5);p3 += sizeL5;memcpy(p3, L6, sizeL6);p3 += sizeL6;memcpy(p3, L7, sizeL7);#ifdef TESTauto t1 = chrono::steady_clock::now();#endif//string L3Path = "./L3.txt";FILE *pL3 = fopen(resultDataPath.c_str(), "w+");fprintf(pL3, "%d\n", cycleCount);//fprintf(pL3, "%s", L3);fwrite(L3, 1, totalSize, pL3);fclose(pL3);#ifdef TESTauto t2 = chrono::steady_clock::now();cout << "fwrite time: " << chrono::duration_cast<chrono::duration<double>>(t2 - t1).count() << " s" << endl;#endif}void L3whole(Graph &gTestData, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3){for (int h = 0; h < vtxNum; h++){for (auto &h0 : gTestData[h]){for (auto &t0 : gTestData[h0]){if (h != t0){arrL2[0] = h0;arrL2[1] = t0;pathL3[h].emplace_back(arrL2);}}}}}void L3head(Graph &pathL2SmHead, Graph &gTestData, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmHead){for (int h = 0; h < vtxNum; h++){for (auto &h0 : pathL2SmHead[h]){for (auto &t0 : gTestData[h0]){if (h < t0){arrL2[0] = h0;arrL2[1] = t0;pathL3SmHead[h].emplace_back(arrL2);}}}}}void L3tail(Graph &gTestData, Graph &pathL2SmTail, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmTail){for (int h = 0; h < vtxNum; h++){for (auto &h0 : gTestData[h]){for (auto &t0 : pathL2SmTail[h0]){if (t0 < h){arrL2[0] = h0;arrL2[1] = t0;pathL3SmTail[h].emplace_back(arrL2);}}}}}void L3tailIdx(Graph &gTestData, Graph &pathL2SmTail, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmTailIdx){for (int h = 0; h < vtxNum; h++){for (auto &h0 : gTestData[h]){for (auto &t0 : pathL2SmTail[h0]){if (t0 < h){arrL2[0] = h;arrL2[1] = h0;pathL3SmTailIdx[t0].emplace_back(arrL2);}}}}}void L4head(vector<vector<array<int, 2>>> &pathL3SmHead, Graph &gTestData, int &vtxNum, array<int, 3> arrL3, vector<vector<array<int, 3>>> &pathL4SmHead){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL3SmHead[h]){//int h0 = it[0];//int h1 = it[1];for (auto &ite : gTestData[it[1]]){int t0 = ite;if (h < t0&&it[0] != t0)//no self loop{arrL3[0] = it[0];arrL3[1] = it[1];arrL3[2] = t0;pathL4SmHead[h].emplace_back(arrL3);}}}}}void L4tailIdx(Graph &gTestData, vector<vector<array<int, 2>>> &pathL3SmTail, int &vtxNum, array<int, 3> arrL3, vector<vector<array<int, 3>>> &pathL4SmTailIdx){for (int h = 0; h < vtxNum; ++h){for (auto &it : gTestData[h]){//int h0 = it;for (auto &ite : pathL3SmTail[it]){//t0 = ite[0];//t1 = ite[1];if (h != ite[0] && ite[1] < h)//no self loop and last one is smallest{arrL3[0] = h;arrL3[1] = it;arrL3[2] = ite[0];pathL4SmTailIdx[ite[1]].emplace_back(arrL3);}}}}}void L3cycle(Graph &pathL2SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 3> arrL3, char* &p3, vector<int> &idxMap, int &cycleCount){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL2SmHead[h]){//h0 = it;for (auto &ite : pathL3SmTailIdx[h]){//t0 = ite[0];//t1 = ite[1];if (it == ite[0])//no self loop{arrL3[0] = h;arrL3[1] = it;arrL3[2] = ite[1];arr2Ascii3(arrL3, p3, idxMap);cycleCount++;}}}}}void L5tailIdx(vector<vector<array<int, 2>>> &pathL3, vector<vector<array<int, 2>>> &pathL3SmTail, int &vtxNum, array<int, 4> arrL4, vector<vector<array<int, 4>>> &pathL5SmTailIdx){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL3[h]){int h0 = it[0];int h1 = it[1];for (auto &ite : pathL3SmTail[h1]){//t0 = ite[0];//t1 = ite[1];if (ite[1] < h&&ite[1] < h0&&h != ite[0] && h0 != ite[0])//small tail and no self loop{arrL4[0] = h;arrL4[1] = h0;arrL4[2] = h1;arrL4[3] = ite[0];pathL5SmTailIdx[ite[1]].emplace_back(arrL4);}}}}}void L4cycle(vector<vector<array<int, 2>>> &pathL3SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 4> arrL4, char* &p4, vector<int> &idxMap, int &cycleCount){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL3SmHead[h]){int h0 = it[0];int h1 = it[1];for (auto &ite : pathL3SmTailIdx[h]){//t0 = ite[0];//t1 = ite[1];if (ite[0] == h1&&h0 != ite[1])//small tail and no self loop{arrL4[0] = h;arrL4[1] = h0;arrL4[2] = h1;arrL4[3] = ite[1];arr2Ascii4(arrL4, p4, idxMap);cycleCount++;}}}}}void L5cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 5> arrL5, char* &p5, vector<int> &idxMap, int &cycleCount){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL4SmHead[h]){int h0 = it[0];int h1 = it[1];int h2 = it[2];for (auto &ite : pathL3SmTailIdx[h]){//t0 = ite[0];//t1 = ite[1];if (h2 == ite[0] && h0 != ite[1] && h1 != ite[1])//cycle and no self loop{arrL5[0] = h;arrL5[1] = h0;arrL5[2] = h1;arrL5[3] = h2;arrL5[4] = ite[1];arr2Ascii5(arrL5, p5, idxMap);cycleCount++;}}}}}void L6cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 3>>> &pathL4SmTailIdx, int &vtxNum, array<int, 6> arrL6, char* &p6, vector<int> &idxMap, int &cycleCount){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL4SmHead[h]){int h0 = it[0];int h1 = it[1];int h2 = it[2];for (auto &ite : pathL4SmTailIdx[h]){//t0 = ite[0];//t1 = ite[1];//t2 = ite[2];if (h2 == ite[0] && h0 != ite[1] && h0 != ite[2] && h1 != ite[1] && h1 != ite[2]){arrL6[0] = h;arrL6[1] = h0;arrL6[2] = h1;arrL6[3] = h2;arrL6[4] = ite[1];arrL6[5] = ite[2];arr2Ascii6(arrL6, p6, idxMap);cycleCount++;}}}}}void L7cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 4>>> &pathL5SmTailIdx, int &vtxNum, array<int, 7> arrL7, char* &p7, vector<int> &idxMap, int &cycleCount){for (int h = 0; h < vtxNum; ++h){for (auto &it : pathL4SmHead[h]){int h0 = it[0];int h1 = it[1];int h2 = it[2];for (auto &ite : pathL5SmTailIdx[h]){//t0 = ite[0];//t1 = ite[1];//t2 = ite[2];//t3 = ite[3];if (!(h2^ite[0])\&&h0 != ite[1] && h1 != ite[1]\&&h0 != ite[2] && h1 != ite[2]\&&h0 != ite[3] && h1 != ite[3]\){arrL7[0] = h;arrL7[1] = h0;arrL7[2] = h1;arrL7[3] = h2;arrL7[4] = ite[1];arrL7[5] = ite[2];arrL7[6] = ite[3];arr2Ascii7(arrL7, p7, idxMap);cycleCount++;}}}}}int main(){#ifdef TESTauto s1 = chrono::steady_clock::now();#endifstring testDataPath = "/data/test_data.txt";string resultDataPath = "/projects/student/result.txt";#ifdef TESTtestDataPath = "./data/1004812/test_data.txt";resultDataPath = "./data/myresult.txt";#endifchar* p3 = L3;char* p4 = L4;char* p5 = L5;char* p6 = L6;char* p7 = L7;int cycleCount = 0;int cycleCount3 = 0;int cycleCount4 = 0;int cycleCount5 = 0;int cycleCount6 = 0;int cycleCount7 = 0;Graph gTestData, pathL2SmHead, pathL2SmTail;vector<int> idxMap;int edgeNum = readTestData(testDataPath, gTestData, idxMap, pathL2SmHead, pathL2SmTail);int vtxNum = idxMap.size();vector<vector<array<int, 2>>> pathL3(vtxNum);vector<vector<array<int, 2>>> pathL3SmHead(vtxNum);vector<vector<array<int, 2>>> pathL3SmTail(vtxNum);vector<vector<array<int, 2>>> pathL3SmTailIdx(vtxNum);vector<vector<array<int, 3>>> pathL4(vtxNum);vector<vector<array<int, 3>>> pathL4SmHead(vtxNum);vector<vector<array<int, 3>>> pathL4SmTailIdx(vtxNum);vector<vector<array<int, 4>>> pathL5(vtxNum);vector<vector<array<int, 4>>> pathL5SmTailIdx(vtxNum);array<int, 2> arrL2;array<int, 3> arrL3;array<int, 4> arrL4;array<int, 5> arrL5;array<int, 6> arrL6;array<int, 7> arrL7;thread threads[4];//3whole=2whole+2whole//L3whole(gTestData, vtxNum, arrL2, pathL3);threads[0] = thread(L3whole, ref(gTestData), ref(vtxNum),(arrL2),ref(pathL3));//threads[0].join();//3head=2head+2whole//L3head(pathL2SmHead, gTestData, vtxNum, arrL2, pathL3SmHead);threads[1] = thread(L3head, ref(pathL2SmHead), ref(gTestData),ref(vtxNum),(arrL2),ref(pathL3SmHead));//3tail=2whole+2tail//L3tailIdx(gTestData, pathL2SmTail, vtxNum, arrL2, pathL3SmTailIdx);threads[2] = thread(L3tailIdx, ref(gTestData), ref(pathL2SmTail),ref(vtxNum),(arrL2),ref(pathL3SmTailIdx));//L3tail(gTestData, pathL2SmTail, vtxNum, arrL2, pathL3SmTail);threads[3] = thread(L3tail, ref(gTestData), ref(pathL2SmTail),ref(vtxNum),(arrL2),ref(pathL3SmTail));for (auto& it : threads) {it.join();}thread threads2[5];//4head=3head+2whole//L4head(pathL3SmHead, gTestData, vtxNum, arrL3, pathL4SmHead);threads2[0] = thread(L4head, ref(pathL3SmHead), ref(gTestData),ref(vtxNum),(arrL3),ref(pathL4SmHead));//4tail=2whole+3tail//L4tailIdx(gTestData, pathL3SmTail, vtxNum, arrL3, pathL4SmTailIdx);threads2[1] = thread(L4tailIdx, ref(gTestData), ref(pathL3SmTail),ref(vtxNum),(arrL3),ref(pathL4SmTailIdx));//3cycle=3head+2whole//L3cycle(pathL2SmHead, pathL3SmTailIdx, vtxNum, arrL3, p3, idxMap, cycleCount);threads2[2] = thread(L3cycle, ref(pathL2SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL3),ref(p3),ref(idxMap),ref(cycleCount3));//5tail=3whole+3tail//L5tailIdx(pathL3, pathL3SmTail, vtxNum, arrL4, pathL5SmTailIdx);threads2[3] = thread(L5tailIdx, ref(pathL3), ref(pathL3SmTail),ref(vtxNum),(arrL4),ref(pathL5SmTailIdx));//4cycle=3head+3tail//L4cycle(pathL3SmHead, pathL3SmTailIdx, vtxNum, arrL4, p4, idxMap, cycleCount);threads2[4] = thread(L4cycle, ref(pathL3SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL4),ref(p4),ref(idxMap),ref(cycleCount4));for (auto& it : threads2) {it.join();}thread threads3[3];//5cycle=4head+3tail//L5cycle(pathL4SmHead, pathL3SmTailIdx, vtxNum, arrL5, p5, idxMap, cycleCount);threads3[0] = thread(L5cycle, ref(pathL4SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL5),ref(p5),ref(idxMap),ref(cycleCount5));//6cycle=4head+4tail //L6cycle(pathL4SmHead, pathL4SmTailIdx, vtxNum, arrL6, p6, idxMap, cycleCount);threads3[1] = thread(L6cycle, ref(pathL4SmHead), ref(pathL4SmTailIdx),ref(vtxNum),(arrL6),ref(p6),ref(idxMap),ref(cycleCount6));//7cycle=4head+5tail//L7cycle(pathL4SmHead, pathL5SmTailIdx, vtxNum, arrL7, p7, idxMap, cycleCount);threads3[2] = thread(L7cycle, ref(pathL4SmHead), ref(pathL5SmTailIdx),ref(vtxNum),(arrL7),ref(p7),ref(idxMap),ref(cycleCount7));for (auto& it : threads3) {it.join();}cycleCount = cycleCount3+cycleCount4+cycleCount5+cycleCount6+cycleCount7;writeResultData(resultDataPath, p3, p4, p5, p6, p7, cycleCount);#ifdef TESTcout << "cycle count: " << cycleCount << endl;#endifreturn 0;}

3、

//多线程实现#include <iostream>#include <algorithm>#include <unordered_map>#include <string.h>#include <vector>#include<chrono>#include<mutex>#include<thread>#include <sys/mman.h>#include <unistd.h>#include <sys/types.h>#include <sys/stat.h>#include <fcntl.h>//#define TEST#define is_digit(c) (c >= '0')using namespace std;const int MAX_REC = 2000000;const int MAX_CYCLE_COUNT = 20000000;const int MAX_ID_COUNT = 500000;//maximum:MAX_REC<<2const int MAX_DEG = 300;//maximum:MAX_ID_COUNT-1const int MAX_TEST_DATA_SIZE = MAX_REC * 33;//maximum: MAX_REC * 33const int MAX_CYCLE3_CHAR_SIZE = MAX_CYCLE_COUNT*0.25 * 33;//maximum: MAX_CYCLE_COUNT * 33const int MAX_CYCLE4_CHAR_SIZE = MAX_CYCLE_COUNT*0.25 * 44;//maximum: MAX_CYCLE_COUNT * 44const int MAX_CYCLE5_CHAR_SIZE = MAX_CYCLE_COUNT*0.5 * 55;//maximum: MAX_CYCLE_COUNT * 55const int MAX_CYCLE6_CHAR_SIZE = MAX_CYCLE_COUNT*0.8 * 66;//maximum: MAX_CYCLE_COUNT * 66const int MAX_CYCLE7_CHAR_SIZE = MAX_CYCLE_COUNT*0.8 * 77;//maximum: MAX_CYCLE_COUNT * 77const int MAX_CYCLE_CHAR_SIZE = MAX_CYCLE_COUNT * 77;//maximum: MAX_CYCLE_COUNT * 77struct Origin_Id{int id;char* pos;int len;Origin_Id() :id(), pos(), len(){};Origin_Id(int id_, char* pos_, int len_) :id(id_), pos(pos_), len(len_){};};string test_data_path = "/data/test_data.txt";string result_data_path = "/projects/student/result.txt";//char file_ptr[MAX_TEST_DATA_SIZE];//int Rec[MAX_REC][3];int(*Rec)[3];//Origin_Id id_set[MAX_REC];Origin_Id* id_set;int rec_count = 0;int id_count = 0;void read_raw_data(){char* file_ptr = new char[MAX_TEST_DATA_SIZE];id_set = new Origin_Id[MAX_REC * 2];Rec = new int[MAX_REC][3];FILE *fp;fp = fopen(test_data_path.c_str(), "rb");int data_sz = fread(file_ptr, 1, MAX_TEST_DATA_SIZE, fp);char* buf_end = file_ptr + data_sz;char* buf = file_ptr;int comma_count = 0;int cur_num = 0;int id_len = 0;char* id_pos = buf;int comma_num = 0;while (buf < buf_end){switch (*buf){case '\n'://reset counter{rec_count++;Rec[rec_count][0] = 0;Rec[rec_count][1] = 0;Rec[rec_count][2] = 0;comma_count = 0;cur_num = 0;id_len = 0;id_pos = buf + 1;break;}case ',':{Origin_Id id_tmp(cur_num, id_pos, id_len);id_set[id_count++] = id_tmp;comma_count++;cur_num = 0;id_len = 0;id_pos = buf + 1;break;}default:{if (is_digit(*buf)){cur_num = cur_num * 10 + *buf - '0';Rec[rec_count][comma_count] = cur_num;id_len++;}}}buf++;}}bool id_cmp(Origin_Id a, Origin_Id b){return a.id<b.id;}bool id_unq(Origin_Id a, Origin_Id b){return a.id == b.id;}int id_num;void id_map(unordered_map<int, int> &id_map_rev){//unordered_map<int, int> id_map_rev;sort(id_set, id_set + id_count, id_cmp);id_num = unique(id_set, id_set + id_count, id_unq) - id_set;//now id_set is id_mapfor (int i = 0; i < id_num; ++i){id_map_rev.insert({ id_set[i].id, i });}}struct Edge{int id;int money;Edge() :id(), money(){};Edge(int id_, int money_) :id(id_), money(money_){};}__attribute__((packed));//__attribute__((aligned(x))),vector<vector<Edge>> Graph(MAX_ID_COUNT);vector<vector<Edge>> ReGraph(MAX_ID_COUNT);//Edge Graph[MAX_ID_COUNT][MAX_DEG]; //Edge ReGraph[MAX_ID_COUNT][MAX_DEG];int deg_G[MAX_ID_COUNT];int deg_RG[MAX_ID_COUNT];bool id_cmp_edge(Edge a, Edge b){return a.id<b.id;}void buid_graph(){unordered_map<int, int> id_map_rev;id_map(id_map_rev);int id_from;int id_to;int money;for (int i = 0; i < rec_count; i++){id_from = id_map_rev[Rec[i][0]];id_to = id_map_rev[Rec[i][1]];if (id_from == id_to) continue;money = Rec[i][2];Edge edge_tmp(id_to, money);/* if (money > 715827882){edge_tmp.money3 = 2147483647;}if (money > 429496729){edge_tmp.money5 = 2147483647;} */Graph[id_from].emplace_back(edge_tmp);edge_tmp.id = id_from;ReGraph[id_to].emplace_back(edge_tmp);deg_G[id_from]++;deg_RG[id_to]++;}for (int i = 0; i<id_num; i++){sort(Graph[i].begin(), Graph[i].begin() + deg_G[i], id_cmp_edge);//sort(ReGraph[i].begin(), ReGraph[i].begin() + deg_RG[i], id_cmp_edge);}}inline bool judgeMoneyRe(double money_3,long long int money5,int money){return (money_3<=money)&&(money<=money5);}inline int mtp3(int momey){int money3;if(momey<=715827882){money3 = momey*3;}else{money3 = 2147483647;}return money3;}inline int mtp5(int momey){int money5;if(momey<=429496729){money5 = momey*5;}else{money5 = 2147483647;}return money5;}bool DFSInv(int &minCur, \bool(&toMinCurL1)[MAX_ID_COUNT], \bool(&toMinCurL2)[MAX_ID_COUNT], \bool(&toMinCurL3)[MAX_ID_COUNT], \int(&toMinCurL1Idx)[MAX_ID_COUNT]){int money[3];double money_3[3];long int money5[3];bool flg = false;for (int i = 0; i < deg_RG[minCur]; ++i){int it = ReGraph[minCur][i].id;int money0 = ReGraph[minCur][i].money;money[0] = money0;money_3[0] = money0/3.0;money5[0] = mtp5(money0);if (minCur < it){toMinCurL1[it] = true;toMinCurL1Idx[it] = i;for (int j = 0; j < deg_RG[it]; ++j){int ite = ReGraph[it][j].id;int money1 = ReGraph[it][j].money;if (minCur < ite&&judgeMoneyRe(money_3[0],money5[0],money1)){//if(!flg)flg = true;toMinCurL2[ite] = true;money[1] = money1;money_3[1] = money1/3.0;money5[1] = mtp5(money1);for (int k = 0; k < deg_RG[ite]; ++k){int iter = ReGraph[ite][k].id;int money2 = ReGraph[ite][k].money;if (minCur < iter&&iter != it&&judgeMoneyRe(money_3[1],money5[1],money2)){toMinCurL3[iter] = true;}}}}}}return flg;}bool judgeMoneyCycle(int(&stk_mny)[7], int num){int money3;int money5;for (int i = 0; i < num - 1; ++i){if(stk_mny[i]<=715827882){money3 = stk_mny[i]*3;}else{money3 = 2147483647;}if(stk_mny[i + 1]<=429496729){money5 = stk_mny[i+1]*5;}else{money5 = 2147483647;}if (money3<stk_mny[i + 1] || stk_mny[i]>money5)return false;}if(stk_mny[num-1]<=715827882){money3 = stk_mny[num-1]*3;}else{money3 = 2147483647;}if(stk_mny[0]<=429496729){money5 = stk_mny[0]*5;}else{money5 = 2147483647;}if (money3<stk_mny[0] || stk_mny[num - 1]>money5)return false;return true;}void piece_memcpy(char *dest,char *src, const int n) {switch(n) {case 33 : *dest++ = *src++;case 32 : *dest++ = *src++;case 31 : *dest++ = *src++;case 30 : *dest++ = *src++;case 29 : *dest++ = *src++;case 28 : *dest++ = *src++;case 27 : *dest++ = *src++;case 26 : *dest++ = *src++;case 25 : *dest++ = *src++;case 24 : *dest++ = *src++;case 23 : *dest++ = *src++;case 22 : *dest++ = *src++;case 21 : *dest++ = *src++;case 20 : *dest++ = *src++;case 19 : *dest++ = *src++;case 18 : *dest++ = *src++;case 17 : *dest++ = *src++;case 16 : *dest++ = *src++;case 15 : *dest++ = *src++;case 14 : *dest++ = *src++;case 13 : *dest++ = *src++;case 12 : *dest++ = *src++;case 11 : *dest++ = *src++;case 10 : *dest++ = *src++;case 9 : *dest++ = *src++;case 8 : *dest++ = *src++;case 7 : *dest++ = *src++;case 6 : *dest++ = *src++;case 5 : *dest++ = *src++;case 4 : *dest++ = *src++;case 3 : *dest++ = *src++;case 2 : *dest++ = *src++;case 1 : *dest++ = *src++; break;default: exit(0);}}inline int idToAscii(int value, char* dst) {int len = id_set[value].len;memcpy(dst, id_set[value].pos, len);return len;}char* L[5];char* pL[5];void arr2Ascii(int(&arr)[7], int num){char* &p = pL[num - 3];int dig;dig = idToAscii(arr[0], p);p += dig;for (int i = 1; i < num; ++i){*p++ = ',';dig = idToAscii(arr[i], p);p += dig;}*p++ = '\n';}char* pT[4][5];char* T[4][5];int cycle_counts[4];char* task_cycle_pos[20000][10];void arr2AsciiThread(int(&arr)[7], int num, int &thread_id, int &task_id){int cycle_type = num - 3;char* &p = pT[thread_id][cycle_type];int dig;dig = idToAscii(arr[0], p);p += dig;for (int i = 1; i < num; ++i){*p++ = ',';dig = idToAscii(arr[i], p);p += dig;}*p++ = '\n';task_cycle_pos[task_id][cycle_type + 5] = p;}inline void pushMoney(int (&money)[6],double (&money_p2)[6],int (&money3)[6],int money_tmp, int layer){money[layer] = money_tmp;money_p2[layer] = 0.2*money_tmp;money3[layer] = mtp3(money_tmp);}inline bool judgeMoney(double money_p2,int money3, int money){return (money_p2<=money&&money<=money3);}void DFS(const int minCur, \int(&stk)[7], int(&stk_mny)[7], bool(&onStack)[MAX_ID_COUNT], bool(&toMinCurL1)[MAX_ID_COUNT], \bool(&toMinCurL2)[MAX_ID_COUNT], \bool(&toMinCurL3)[MAX_ID_COUNT], \int(&toMinCurL1Idx)[MAX_ID_COUNT], int &thread_id, int &task_id,\int (&money)[6],double (&money_p2)[6],int (&money3)[6]){stk[0] = minCur;int s1 = minCur;onStack[s1] = true;int money_tmp;double moneyO_3;int moneyO5;for (int i2 = 0; i2 < deg_G[s1]; i2++){auto s2 = Graph[s1][i2].id;if (s2 <= minCur) continue;stk[1] = s2;onStack[s2] = true;money_tmp = Graph[s1][i2].money;pushMoney(money,money_p2,money3, money_tmp, 0);moneyO_3 = money_tmp/3.0;moneyO5 = mtp5(money_tmp);for (int i3 = 0; i3 < deg_G[s2]; i3++){auto s3 = Graph[s2][i3].id;money_tmp = Graph[s2][i3].money;if (s3 <= minCur || onStack[s3] || !judgeMoney(money_p2[0],money3[0],money_tmp)) continue;pushMoney(money,money_p2,money3, money_tmp, 1);//pushMoney(stk_mny, s2, i3, 1);stk[2] = s3;onStack[s3] = true;if (toMinCurL1[s3]){//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s3], 2);money_tmp = ReGraph[minCur][toMinCurL1Idx[s3]].money;//if (judgeMoneyCycle(stk_mny, 3))if (judgeMoney(money_p2[1],money3[1],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp)){arr2AsciiThread(stk, 3, thread_id, task_id);cycle_counts[thread_id]++;}}for (int i4 = 0; i4 < deg_G[s3]; i4++){auto s4 = Graph[s3][i4].id;money_tmp = Graph[s3][i4].money;if (onStack[s4] || s4 <= minCur||!judgeMoney(money_p2[1],money3[1],money_tmp)) continue;stk[3] = s4;onStack[s4] = true;//pushMoney(stk_mny, s3, i4, 2);pushMoney(money,money_p2,money3, money_tmp, 2);if (toMinCurL1[s4]){money_tmp = ReGraph[minCur][toMinCurL1Idx[s4]].money;//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s4], 3);//if (judgeMoneyCycle(stk_mny, 4))if (judgeMoney(money_p2[2],money3[2],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp)){arr2AsciiThread(stk, 4, thread_id, task_id);cycle_counts[thread_id]++;}}for (int i5 = 0; i5 < deg_G[s4]; i5++){auto s5 = Graph[s4][i5].id;money_tmp = Graph[s4][i5].money;if (minCur >= s5 || onStack[s5] || !judgeMoney(money_p2[2],money3[2],money_tmp)||!toMinCurL3[s5] && !toMinCurL2[s5] && !toMinCurL1[s5]) continue;stk[4] = s5;pushMoney(money,money_p2,money3, money_tmp, 3);if (toMinCurL1[s5]){//stk[4] = s5;//pushMoney(stk_mny, s4, i5, 3);//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s5], 4);money_tmp = ReGraph[minCur][toMinCurL1Idx[s5]].money;if (judgeMoney(money_p2[3],money3[3],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp)){arr2AsciiThread(stk, 5, thread_id, task_id);cycle_counts[thread_id]++;}}if (toMinCurL2[s5]){//stk[4] = s5;//pushMoney(stk_mny, s4, i5, 3);//pushMoney(money,money_p2,money3, money_tmp, 3);for (int i6 = 0; i6 < deg_G[s5]; i6++){auto s6 = Graph[s5][i6].id;money_tmp = Graph[s5][i6].money;if (minCur<s6&&!onStack[s6] && toMinCurL1[s6]&&judgeMoney(money_p2[3],money3[3],money_tmp)){stk[5] = s6;//money_tmp = Graph[s5][i6].money;//pushMoney(stk_mny, s5, i6, 4);//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s6], 5);pushMoney(money,money_p2,money3, money_tmp, 4);money_tmp = ReGraph[minCur][toMinCurL1Idx[s6]].money;if (judgeMoney(money_p2[4],money3[4],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp)){arr2AsciiThread(stk, 6, thread_id, task_id);cycle_counts[thread_id]++;}}}}if (toMinCurL3[s5]){//stk[4] = s5;onStack[s5] = true;//pushMoney(stk_mny, s4, i5, 3);for (int i6 = 0; i6 < deg_G[s5]; i6++){auto s6 = Graph[s5][i6].id;money_tmp = Graph[s5][i6].money;if (toMinCurL2[s6] && !onStack[s6] && minCur <= s6&&judgeMoney(money_p2[3],money3[3],money_tmp)){stk[5] = s6;//pushMoney(stk_mny, s5, i6, 4);pushMoney(money,money_p2,money3, money_tmp, 4);for (int i7 = 0; i7 < deg_G[s6]; i7++){auto s7 = Graph[s6][i7].id;money_tmp = Graph[s6][i7].money;if (minCur<s7&&toMinCurL1[s7] && !onStack[s7]&&judgeMoney(money_p2[4],money3[4],money_tmp)){stk[6] = s7;pushMoney(money,money_p2,money3, money_tmp, 5);//pushMoney(stk_mny, s6, i7, 5);//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s7], 6);money_tmp = ReGraph[minCur][toMinCurL1Idx[s7]].money;//if (judgeMoneyCycle(stk_mny, 7))if (judgeMoney(money_p2[5],money3[5],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp)){arr2AsciiThread(stk, 7, thread_id, task_id);cycle_counts[thread_id]++;}}}}}onStack[s5] = false;}}onStack[s4] = false;}onStack[s3] = false;}onStack[s2] = false;}onStack[minCur] = false;}void do_task(int cur_task_id, int(&stk)[7], int(&stk_mny)[7], bool(&onStack)[MAX_ID_COUNT], \bool(&toMinCurL1)[MAX_ID_COUNT], bool(&toMinCurL2)[MAX_ID_COUNT], bool(&toMinCurL3)[MAX_ID_COUNT], \int(&toMinCurL1Idx)[MAX_ID_COUNT], vector<int> &task_set, int &thread_id, int &task_id,\int (&money)[6],double (&money_p2)[6],int (&money3)[6]){int from = task_set[cur_task_id];int to = task_set[cur_task_id + 1];for (int minCur = from; minCur < to; ++minCur){memset(toMinCurL1, false, id_num*sizeof(bool));memset(toMinCurL2, false, id_num*sizeof(bool));memset(toMinCurL3, false, id_num*sizeof(bool));bool flg = DFSInv(minCur, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx);if (flg == true){DFS(minCur, stk, stk_mny, onStack, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx, thread_id, task_id,money,money_p2,money3);}}}vector<mutex> mtx(20000);void dfs_thread(int thread_id, vector<int> &task_set){bool onStack[MAX_ID_COUNT] = { false };int stk[7];int stk_mny[7];bool toMinCurL1[MAX_ID_COUNT] = { false };bool toMinCurL2[MAX_ID_COUNT] = { false };bool toMinCurL3[MAX_ID_COUNT] = { false };int toMinCurL1Idx[MAX_ID_COUNT];int cur_task_id = 0;char* ptr_from_cur[5] = { pT[thread_id][0], pT[thread_id][1], pT[thread_id][2],pT[thread_id][3],pT[thread_id][4] };int money[6];double money_p2[6];int money3[6];while(cur_task_id<task_set.size()-1){if(mtx[cur_task_id].try_lock()){memcpy(task_cycle_pos[cur_task_id], ptr_from_cur, sizeof(char*)* 5);memcpy(task_cycle_pos[cur_task_id]+5, ptr_from_cur, sizeof(char*)* 5);do_task(cur_task_id, stk, stk_mny, onStack, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx, task_set,thread_id,cur_task_id,\money,money_p2,money3);memcpy(ptr_from_cur, task_cycle_pos[cur_task_id]+5, sizeof(char*)* 5);}cur_task_id++;}}int task_num;void run_dfs(){vector<int> task_set;task_num = id_num / 100+1;for (int i = 0; i < task_num; i++){task_set.push_back(i * 100);}task_set.push_back(id_num+1);thread threads[4];for (int i = 0; i < 4; i++){threads[i] = thread(dfs_thread, i, ref(task_set));}for (auto& it : threads) {it.join();}}int getDigits(int v) {if (v < 10) return 1;if (v < 100) return 2;if (v < 1000) return 3;if (v < 1000000000000) { // 10^12if (v < 100000000) { // 10^7if (v < 1000000) { // 10^6if (v < 10000) return 4;return 5 + (v >= 100000); // 10^5}return 7 + (v >= 10000000); // 10^7}if (v < 10000000000) { // 10^10return 9 + (v >= 1000000000); // 10^9c}return 11 + (v >= 100000000000); // 10^11}return 12 + getDigits(v / 1000000000000); // 10^12}int intToAscii(int value, char* dst) {static const char digits[] = {"0001020304050607080910111213141516171819""222324252627282930313233343536373839""4041424344454647484950515253545556575859""6061626364656667686970717273747576777879""8081828384858687888990919293949596979899" };const int length = getDigits(value);int next = length - 1;while (value >= 100) {const int i = (value % 100) * 2;value /= 100;dst[next - 1] = digits[i];dst[next] = digits[i + 1];next -= 2;}// Handle last 1-2 digitsif (value < 10) {dst[next] = '0' + int(value);}else {int i = int(value) * 2;dst[next - 1] = digits[i];dst[next] = digits[i + 1];}return length;}char *LAll;int cycleCount = 0;void write_th(int th_id,int (&task_sz)[5],void *(&dst_ptr)){int from = task_sz[th_id];int size = task_sz[th_id+1]-from;// cout<<endl<<th_id<<" "<<from<<" "<<size<<endl;// cout<<(char*)dst_ptr+size-(char*)dst_ptr<<endl;memcpy((char*)dst_ptr+from,LAll+from,size);}void write_result(){char* pAll = LAll;int dig = intToAscii(cycleCount, pAll);int sz;pAll += dig;*pAll++ = '\n';for (int j = 0; j < 5; j++){for (int i = 0; i < task_num; i++){auto pos_ptr = task_cycle_pos[i];sz = pos_ptr[j + 5] - pos_ptr[j];memcpy(pAll, (pos_ptr[j]), sz);pAll += sz;}}int total = pAll - LAll;int task_sz = total/4;int task_set[5] = {0,task_sz,2*task_sz,3*task_sz,total+1};int fd = open(result_data_path.c_str(),O_RDWR|O_CREAT,0666);//cout<<111<<endl;truncate(result_data_path.c_str(),total);//cout<<total<<endl;//cout<<task_sz<<endl;// for(int i=0;i<5;i++)// {// cout<<task_set[i]<<endl;// }void *dst_ptr=mmap(NULL,total,PROT_WRITE|PROT_READ,MAP_SHARED,fd,0);//memcpy(dst_ptr,LAll,total);thread threads[4];for (int i = 0; i < 4; i++){threads[i] = thread(write_th, i, ref(task_set),ref(dst_ptr));}for (auto& it : threads) {it.join();}//exit(0);//munmap(dst_ptr,total);}int main(){#ifdef TESTtest_data_path = "./data/std/test_data.txt";result_data_path = "./data/myresult.txt";auto s1 = chrono::steady_clock::now();#endif// int a = 3;// double b = 3.3;// bool flg = (a>b);// cout << flg <<endl;read_raw_data();#ifdef TESTauto s2 = chrono::steady_clock::now();cout << "read data time: " << chrono::duration_cast<chrono::duration<double>>(s2 - s1).count() << " s" << endl;#endif//id_map();buid_graph();#ifdef TESTauto s3 = chrono::steady_clock::now();cout << "build graph time: " << chrono::duration_cast<chrono::duration<double>>(s3 - s2).count() << " s" << endl;#endif// L[0] = new char[MAX_CYCLE3_CHAR_SIZE / 10];// L[1] = new char[MAX_CYCLE4_CHAR_SIZE / 100];// L[2] = new char[MAX_CYCLE5_CHAR_SIZE / 100];// L[3] = new char[MAX_CYCLE6_CHAR_SIZE / 100];// L[4] = new char[MAX_CYCLE7_CHAR_SIZE / 100];for (int i = 0; i<5; ++i){pL[i] = L[i];}LAll = new char[MAX_CYCLE_CHAR_SIZE];for (int i = 0; i < 4; i++){T[i][0] = new char[MAX_CYCLE3_CHAR_SIZE];pT[i][0] = T[i][0];T[i][1] = new char[MAX_CYCLE4_CHAR_SIZE];pT[i][1] = T[i][1];T[i][2] = new char[MAX_CYCLE5_CHAR_SIZE];pT[i][2] = T[i][2];T[i][3] = new char[MAX_CYCLE6_CHAR_SIZE];pT[i][3] = T[i][3];T[i][4] = new char[MAX_CYCLE7_CHAR_SIZE];pT[i][4] = T[i][4];}run_dfs();#ifdef TESTauto s4 = chrono::steady_clock::now();cout << "dfs time: " << chrono::duration_cast<chrono::duration<double>>(s4 - s3).count() << " s" << endl;#endiffor (int i = 0; i < 4; i++)cycleCount += cycle_counts[i];write_result();#ifdef TESTauto s5 = chrono::steady_clock::now();cout << "write result time: " << chrono::duration_cast<chrono::duration<double>>(s5 - s4).count() << " s" << endl;#endif#ifdef TESTcout << id_num << endl;cout << cycleCount << endl;#endifexit(0);}

本内容不代表本网观点和政治立场,如有侵犯你的权益请联系我们处理。
网友评论
网友评论仅供其表达个人看法,并不表明网站立场。