Fine-Tuning a BERT Model: Producing a Model from Your Own Training Data


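1 Introduction

Although many pretrained models are available, earlier experiments showed that they do not really meet the needs of our field, geotechnical engineering. To get there, we have to fine-tune our own model, GeotechSet, on top of a pretrained one. The main reason I put this off for so long is time: with my current hardware, training a new model can take several hours (the example model below took about 50 minutes to train on 1.3M of training data). This note briefly summarizes the fine-tuning process and checks whether the resulting model is usable.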

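2 Training the model

The fine-tuning code is in training_stsbenchmark.py, and the training dataset is in the datasets folder.

Any Transformers model can serve as the pretrained base, for example BERT, RoBERTa, XLNet, XLM-R, or DistilBERT (model names such as bert-base-uncased, bert-base-cased, roberta-base, xlm-roberta-base).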

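(1) First, load the pretrained model as the word-embedding module

from sentence_transformers import SentenceTransformer, models
word_embedding_model = models.Transformer(model_name)  # model_name: e.g. 'bert-base-uncased'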

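(2) Apply mean pooling to obtain a fixed-size sentence vector

# Average the token embeddings; the CLS-token and max-pooling modes are switched off
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
    pooling_mode_cls_token=False,
    pooling_mode_max_tokens=False,
)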
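To make the pooling step concrete, here is a minimal pure-Python sketch of mean pooling: the token vectors of one sentence are averaged, and padded positions are skipped using an attention mask. The helper name mean_pool and the toy vectors are illustrative assumptions; the library performs the same operation on tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding tokens
            count += 1
            for i, value in enumerate(vec):
                totals[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in totals]


# Toy input: three tokens of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Whatever the number of tokens, the output has the same dimension as a single token vector, which is what makes sentences of different lengths comparable.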

(3) Build the SentenceTransformer model

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

(4) Convert the training data, split it into three datasets, and load it:

train_samples = []

dev_samples = []

test_samples = []
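How the three lists get filled is not shown in this note. A minimal sketch, assuming STS-benchmark-style rows that carry a split field and a similarity score from 0 to 5: the Pair class is a stand-in for the library's InputExample, and the rows are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pair:
    """Stand-in for sentence_transformers.InputExample (illustrative only)."""
    texts: List[str]
    label: float


# Invented rows in an STS-benchmark-like layout (assumption, not the author's data).
rows = [
    {"split": "train", "score": "5.0", "sentence1": "Slope failure occurred.", "sentence2": "The slope failed."},
    {"split": "train", "score": "1.0", "sentence1": "Rock bridges were mapped.", "sentence2": "Rainfall was measured."},
    {"split": "dev", "score": "2.5", "sentence1": "A DFN model was built.", "sentence2": "Fractures were simulated."},
    {"split": "test", "score": "4.0", "sentence1": "The pit wall is steep.", "sentence2": "The open pit slope is steep."},
]

samples: Dict[str, List[Pair]] = {"train": [], "dev": [], "test": []}
for row in rows:
    # Scale the 0-5 score to 0-1 so it is comparable to a cosine similarity.
    label = float(row["score"]) / 5.0
    samples[row["split"]].append(Pair(texts=[row["sentence1"], row["sentence2"]], label=label))

train_samples, dev_samples, test_samples = samples["train"], samples["dev"], samples["test"]
print(len(train_samples), len(dev_samples), len(test_samples))
```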

Using train_samples:

from torch.utils.data import DataLoader
from sentence_transformers import losses
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=train_batch_size)
train_loss = losses.CosineSimilarityLoss(model=model)
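CosineSimilarityLoss trains the model so that the cosine similarity between the two sentence embeddings of a pair matches its gold label. A minimal pure-Python sketch of that objective for one batch, assuming the squared-error form; the function names and toy vectors are illustrative and not taken from the script.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch: List[Tuple[Sequence[float], Sequence[float], float]]) -> float:
    """Mean squared error between predicted cosine similarity and gold label."""
    errors = [(cosine(u, v) - label) ** 2 for u, v, label in batch]
    return sum(errors) / len(errors)


batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, label 1.0: no error
    ([1.0, 0.0], [0.0, 1.0], 0.5),  # orthogonal vectors (cosine 0), label 0.5
]
loss = cosine_similarity_loss(batch)
print(loss)
```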

Using dev_samples:

from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(dev_samples, name='sts-dev')
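EmbeddingSimilarityEvaluator scores a model by correlating its predicted similarities with the gold labels; one of the statistics it reports is the Spearman rank correlation. A minimal sketch of that statistic, assuming no tied values; the helper names and numbers are illustrative.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from 1 (smallest) to n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Spearman rank correlation via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2-1))."""
    n = len(predicted)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(predicted), ranks(gold)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


predicted = [0.91, 0.15, 0.48, 0.62]  # toy cosine similarities from a model
gold = [1.0, 0.2, 0.7, 0.5]           # toy gold labels scaled to 0-1
rho = spearman(predicted, gold)
print(rho)
```

Because only the ranking matters, a model can score well here even if its raw similarities are systematically too high or too low.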

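(5) Train the model

# Evaluate on the dev set every 1000 steps and save the model to model_save_path
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=num_epochs,
    evaluation_steps=1000,
    warmup_steps=warmup_steps,
    output_path=model_save_path,
)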
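The training call takes num_epochs and warmup_steps as given. One common choice, assumed here rather than taken from the script, is to warm up the learning rate over the first 10% of all training steps. The sample count, batch size, and epoch count below are placeholders.

```python
import math

num_train_samples = 5000  # placeholder, not the author's dataset size
train_batch_size = 16     # placeholder
num_epochs = 4            # placeholder

# Number of batches per epoch, i.e. what len(train_dataloader) would report.
steps_per_epoch = math.ceil(num_train_samples / train_batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * 0.1)  # assumed 10% warm-up
print(steps_per_epoch, total_steps, warmup_steps)
```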

(6) Evaluate the model

# Reload the saved model and score it on the held-out test set
model = SentenceTransformer(model_save_path)
test_evaluator = EmbeddingSimilarityEvaluator.from_input_examples(test_samples, name='sts-test')
test_evaluator(model, output_path=model_save_path)

The trained model is saved in the output folder.

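3 Validating the model

Once training finishes, we need to check that the model works properly. The check uses sentence-transformers-similarity.py; the new model is named stb and is the last entry in the figure below.

We search test.txt for the three sentences most similar to the query "step-path failure models of large open pit slopes". The results are: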

[1] Based on the results of the laboratory simulations step-path failure models of large open pit slopes are presented and the influence of intact rock bridge length, step-path overlap and fracture spacing discussed.

[2] {FracMan} ==>Simulations are presented based on DFN models of a large conceptual rock slope and incorporating varied failure mechanisms, demonstrating the importance of both considering realistic fracture mechanisms and the ability to model complex failure paths involving sliding along discontinuities, dilation, and intact rock fracture.

[3] {Characterisation of High Rock Slopes using an Integrated Numerical Modelling} ==>3. a) Photogrammetry derived point cloud of open pit slope face b) 3D mesh generated from 3D point cloud c) 3D mesh imported into 3D numerical code (Slope Model) B-2. Characterization of brittle fracture in surface mines using conceptual simulation Failure of large rock slopes is often a combination of slip and/or opening of preexissting non-persistent discontinuities and intact rock failure.
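A lookup of this kind amounts to ranking every sentence in the corpus by cosine similarity to the query embedding and keeping the top three. A minimal sketch with toy 3-dimensional vectors standing in for the model's embeddings; the top_k helper and the vectors are illustrative assumptions, since the contents of sentence-transformers-similarity.py are not shown in this note.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def top_k(query_vec: Sequence[float], corpus_vecs: List[Sequence[float]], k: int = 3) -> List[Tuple[int, float]]:
    """Return (corpus index, score) for the k most similar vectors, best first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


query = [1.0, 0.0, 0.0]
corpus = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [1.0, 0.0, 0.0],   # same direction as the query
    [1.0, 1.0, 0.0],   # 45 degrees away
    [-1.0, 0.0, 0.0],  # opposite direction
    [2.0, 1.0, 0.0],   # closer than 45 degrees
]
hits = top_k(query, corpus, k=3)
for index, score in hits:
    print(index, round(score, 3))
```

In the real check, each vector would come from encoding one sentence of test.txt with the fine-tuned model, and the indices would map back to the sentences listed above.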

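4 Conclusion

This note fine-tuned a new model on a small sample dataset and checked that the result is usable. The results show that the current fine-tuning workflow runs on my local machine. Going forward, we will gradually train our own GeotechSet model.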

Source: 计算岩土力学 (Computational Geomechanics)


Copyright belongs to the author. Sharing is welcome; reproduction without permission is prohibited.

First published: 2022-11-20