Jiangdong's Notes

Overcoming difficulties is victory



Chinese Semantic Error Sentence Recognition Challenge

I. Competition Background

In recent years, with the rise of self-publishing media, everyone has become a producer of information, and the amount of erroneous text on the internet has exploded; avoiding these errors has become a pressing concern, and text-proofreading competitions have multiplied as a result. Past work, however, has focused mainly on spelling and grammatical errors, which are relatively easy for humans to spot and typically arise from foreign-language learners or careless native Chinese writers. For industries such as publishing and education that need to detect deeper Chinese semantic errors, recognizing semantically ill-formed sentences is far more valuable. Such sentences appear frequently in middle- and high-school Chinese exams, where they measure students' command of the language; they are difficult even for students, which also makes them significant for research.

II. Task

Chinese semantic error recognition is a binary classification problem: predict whether a sentence is semantically ill-formed. Semantic errors differ from spelling and grammatical errors in that they concern the well-formedness of the sentence at the semantic level. Examples are given in the table below.

Faulty sentence | Analysis
英法联军烧毁并洗劫了北京圆明园。 | Wrong event order: the army must have "looted" (洗劫) first, then "burned" (烧毁).
山上的水宝贵,我们把它留给晚上来的人喝。 | Ambiguous segmentation: "晚上/来" (those who come in the evening) vs. "晚/上来" (those who come up later).
国内彩电市场严重滞销。 | Collocation error: a "market" (市场) cannot itself be "unsalable" (滞销).

III. Evaluation Rules

1. Data Description

The data for this competition comes partly from online question banks of faulty sentences for primary and secondary schools, and partly from manual annotation. Each record contains a sentence id, a label (0: correct sentence / 1: faulty sentence), and the sentence itself; the three fields are separated by tabs. The format is illustrated below:

id | label | sentence
1 | 1 | 英法联军烧毁并洗劫了北京圆明园。
2 | 1 | 山上的水宝贵,我们把它留给晚上来的人喝。
3 | 0 | 国内彩电严重滞销。

The training samples are provided by the HIT-iFLYTEK Joint Laboratory (哈工大讯飞联合实验室). In the training set, faulty and correct sentences appear in a ratio of roughly 7:3. Participants must train only on the provided training set: no additional manually annotated data may be used, and the test set must never be used for training. The competition has a preliminary round and a final round, both using the same training set.
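Given the tab-separated layout described above, a record can be loaded with pandas. A minimal sketch, assuming a hypothetical file name train.tsv with no header row (the notebook below actually loads data.xlsx with pd.read_excel):

import pandas as pd

# 'train.tsv' is a hypothetical name for illustration only.
df = pd.read_csv('train.tsv', sep='\t', names=['id', 'label', 'text'])
print(df.head())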

2. Evaluation Metric

Submissions are scored on the result file using the F1-score computed on the faulty-sentence class.
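Concretely, with precision P and recall R computed on the faulty (label 1) class,

P = TP / (TP + FP),    R = TP / (TP + FN),    F1 = 2 * P * R / (P + R)

so both missing a faulty sentence and flagging a correct one are penalized.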

Implementation Code

Import the required packages

import os
import random
from functools import partial

import numpy as np
import pandas as pd
import paddle
import paddle as P  # short alias used below for P.save / P.load
import paddle.nn.functional as F
import paddlenlp as ppnlp
from paddle.io import Dataset
from paddlenlp.data import Stack, Tuple, Pad
from paddlenlp.datasets import MapDataset
from paddlenlp.transformers import LinearDecayWithWarmup
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold
from tqdm import tqdm

Initialize all the required parameters


# =============================== Initialization ========================
class Config:
    text_col = 'text'
    target_col = 'label'
    # maximum sequence length
    max_len = 90
    # batch size used during training and evaluation
    batch_size = 32
    target_size = 2
    seed = 71
    n_fold = 5
    # peak learning rate during training
    learning_rate = 5e-5
    # number of training epochs
    epochs = 5  # 3
    # warmup proportion for the learning-rate schedule
    warmup_proportion = 0.1
    # weight decay, an L2-style regularizer that discourages overfitting
    weight_decay = 0.01
    # model_name = "ernie-gram-zh"
    # model_name = "ernie-doc-base-zh"
    model_name = "ernie-1.0"
    print_freq = 100

Adversarial training with FGM (Fast Gradient Method)
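FGM perturbs the embedding parameters one step in the gradient direction before a second forward/backward pass: with embedding gradient g, the added perturbation is r_adv = epsilon * g / ||g||_2, so the model is also trained against locally worst-case inputs. The class below backs up the embedding values, applies r_adv, and restores the backup afterwards.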

class FGM():
    """Adversarial training via gradient-direction perturbation of the embedding layer, Fast Gradient Method (FGM)."""

    def __init__(self, model):
        self.model = model
        self.backup = {}

    def attack(self, epsilon=0.15, emb_name='embeddings'):
        # set emb_name to the name of the embedding parameters in your model
        for name, param in self.model.named_parameters():
            if not param.stop_gradient and emb_name in name:  # only trainable embedding parameters
                self.backup[name] = param.numpy()  # back up the original values
                grad_tensor = paddle.to_tensor(param.grad)  # param.grad may be a numpy array
                norm = paddle.norm(grad_tensor)  # L2 norm of the gradient
                if norm != 0:
                    r_at = epsilon * grad_tensor / norm
                    param.set_value(param + r_at)  # add the perturbation (Tensor.add is not in-place in Paddle)

    def restore(self, emb_name='embeddings'):
        # set emb_name to the name of the embedding parameters in your model
        for name, param in self.model.named_parameters():
            if not param.stop_gradient and emb_name in name:
                assert name in self.backup
                param.set_value(self.backup[name])  # restore the original embedding values
        self.backup = {}

Set random seeds to make results reproducible

def seed_torch(seed=42):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    paddle.seed(seed)  # also seed Paddle itself, otherwise weight init and dropout are not reproducible

os.listdir('/home/aistudio/data/data176811/')
['data.xlsx', 'test1.csv', '提交示例.csv']

Read the data

CFG = Config()
seed_torch(seed=CFG.seed)
train = pd.read_excel('data/data176811/data.xlsx')
test = pd.read_table('data/data176811/test1.csv')
train
id label text
0 1 1 通过大力发展社区教育,使我省全民终身学习的教育体系已深入人心。
1 2 1 再次投入巨资的英超劲旅曼城队能否在2010-2011年度的英超联赛中夺得英超冠军,曼联、切尔...
2 3 1 广西居民纸质图书的阅读率偏低,手机阅读将成为了广西居民极倾向的阅读方式。
3 4 1 文字书写时代即将结束,预示着人与字之间最亲密的一种关系已经终结。与此同时,屏幕文化造就了另一...
4 5 1 安徽合力公司2006年叉车销售强劲,销售收入涨幅很有可能将超过40%以上。公司预计2006年...
... ... ... ...
45242 45244 0 进入5月以来,全国新增人感染H7N9禽流感病例呈明显下降趋势。
45243 45245 1 建设中国新一代天气雷达监测网,能够明显改善对热带气旋或台风登陆位置及强度预报的准确性,尤其对...
45244 45246 1 每当回忆起和他朝夕相处的一段生活,他那循循善诱的教导和那和蔼可亲的音容笑貌,又重新出现在我的面前。
45245 45247 1 8月,延安市公开拍卖35辆超编超标公务车。在拍卖过程中,多辆年份较新、行驶里程较少的公务车竞...
45246 45248 1 清华大学联合剑桥大学、麻省理工学院,成立低碳能源大学联盟未来交通研究中心,他们试图寻找解决北...

45247 rows × 3 columns

Define the 5-fold cross-validation split

# CV split
folds = train.copy()
Fold = StratifiedKFold(n_splits=CFG.n_fold, shuffle=True, random_state=CFG.seed)
for n, (train_index, val_index) in enumerate(Fold.split(folds, folds[CFG.target_col])):
    folds.loc[val_index, 'fold'] = int(n)
folds['fold'] = folds['fold'].astype(int)
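As a quick sanity check (not part of the original notebook), one can confirm that stratification preserved the roughly 7:3 label ratio inside every fold:

# Per-fold label proportions; each fold should be close to the global 7:3 split.
print(folds.groupby('fold')[CFG.target_col].value_counts(normalize=True))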

Data preprocessing

# ====================================== Dataset and conversion functions ==============================
class CustomDataset(Dataset):
    def __init__(self, df):
        self.data = df.values.tolist()
        self.texts = df[CFG.text_col]
        self.labels = df[CFG.target_col]

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        """
        Return the example at the given index.
        :param idx: integer index
        :return: dict holding the raw text and its label
        """
        text = str(self.texts[idx])
        label = self.labels[idx]
        example = {'text': text, 'label': label}

        return example
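A quick look at a single example (illustrative; the exact text is whatever sits at index 0):

ds = CustomDataset(train)
print(len(ds))  # 45247, matching the DataFrame above
print(ds[0])    # e.g. {'text': '通过大力发展社区教育,...', 'label': 1}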

Convert the examples into model inputs (tokenization)

def convert_example(example, tokenizer, max_seq_length=512, is_test=False):
    """
    Build ERNIE/BERT-style inputs from a raw example::

        0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
        | first sequence      | second sequence |

    Returns:
        input_ids(obj:`list[int]`): The list of token ids.
        token_type_ids(obj:`list[int]`): List of sequence pair mask.
        label(obj:`numpy.array`, data type of int64, optional): The input label if not is_test.
    """
    encoded_inputs = tokenizer(text=example["text"], max_seq_len=max_seq_length)
    input_ids = encoded_inputs["input_ids"]
    token_type_ids = encoded_inputs["token_type_ids"]

    if not is_test:
        label = np.array([example["label"]], dtype="int64")
        return input_ids, token_type_ids, label
    else:
        return input_ids, token_type_ids
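Once the tokenizer is instantiated in the next section, a single example converts as follows (a sketch; the exact ids depend on the model vocabulary):

example = {'text': '国内彩电市场严重滞销。', 'label': 1}
input_ids, token_type_ids, label = convert_example(example, tokenizer, max_seq_length=CFG.max_len)
# input_ids starts with [CLS] and ends with [SEP]; token_type_ids is all zeros
# for a single sentence; label is np.array([1], dtype='int64').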

Create the DataLoader

def create_dataloader(dataset,
                      mode='train',
                      batch_size=1,
                      batchify_fn=None,
                      trans_fn=None):
    if trans_fn:
        dataset = dataset.map(trans_fn)

    shuffle = True if mode == 'train' else False
    if mode == 'train':
        batch_sampler = paddle.io.DistributedBatchSampler(
            dataset, batch_size=batch_size, shuffle=shuffle)
    else:
        batch_sampler = paddle.io.BatchSampler(
            dataset, batch_size=batch_size, shuffle=shuffle)

    return paddle.io.DataLoader(
        dataset=dataset,
        batch_sampler=batch_sampler,
        collate_fn=batchify_fn,
        return_list=True)
# Pick the tokenizer that matches the chosen pretrained model
if CFG.model_name == 'ernie-1.0':
    tokenizer = ppnlp.transformers.ErnieTokenizer.from_pretrained('ernie-1.0')
elif CFG.model_name == 'ernie-doc-base-zh':
    tokenizer = ppnlp.transformers.ErnieDocTokenizer.from_pretrained('ernie-doc-base-zh')
else:
    tokenizer = ppnlp.transformers.ErnieGramTokenizer.from_pretrained(CFG.model_name)

trans_func = partial(
    convert_example,
    tokenizer=tokenizer,
    max_seq_length=CFG.max_len)
batchify_fn = lambda samples, fn=Tuple(
    Pad(axis=0, pad_val=tokenizer.pad_token_id),       # input ids
    Pad(axis=0, pad_val=tokenizer.pad_token_type_id),  # segment ids
    Stack(dtype="int64")                               # labels
): [data for data in fn(samples)]
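How the collate function behaves on a toy batch of two already-converted samples (the id values are made up for illustration):

samples = [
    ([1, 5, 7, 2], [0, 0, 0, 0], np.array([1], dtype='int64')),
    ([1, 9, 2],    [0, 0, 0],    np.array([0], dtype='int64')),
]
input_ids, token_type_ids, labels = batchify_fn(samples)
# Both id sequences are padded to the batch maximum (length 4) with the
# tokenizer's pad ids; labels are stacked into an int64 array of shape (2, 1).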

Evaluate the model

# ====================================== Validation and prediction functions ==============================

@paddle.no_grad()
def evaluate(model, criterion, metric, data_loader):
    """
    Run validation: returns accuracy and binary F1 on the faulty class.
    """
    model.eval()
    metric.reset()
    losses = []
    preds_list = []
    labels_list = []

    for batch in data_loader:
        input_ids, token_type_ids, labels = batch
        logits = model(input_ids, token_type_ids)
        preds_list.append(np.argmax(logits.numpy(), axis=1))
        labels_list.append(labels.numpy().flatten())
        loss = criterion(logits, labels)
        losses.append(loss.numpy())
        correct = metric.compute(logits, labels)
        metric.update(correct)
    accu = metric.accumulate()
    # f1_score expects (y_true, y_pred); 'binary' scores the positive (faulty) class
    f1 = f1_score(np.concatenate(labels_list), np.concatenate(preds_list), average='binary')
    print("eval loss: %.5f, accu: %.5f" % (np.mean(losses), accu))
    model.train()
    metric.reset()
    return accu, f1

Predict with the model

def predict(model, data, tokenizer, batch_size=1):
    """
    Run the model over raw examples and return class probabilities.
    """
    examples = []
    for text in data:
        input_ids, segment_ids = convert_example(
            text,
            tokenizer,
            max_seq_length=CFG.max_len,
            is_test=True)
        examples.append((input_ids, segment_ids))

    batchify_fn = lambda samples, fn=Tuple(
        Pad(axis=0, pad_val=tokenizer.pad_token_id),       # input ids
        Pad(axis=0, pad_val=tokenizer.pad_token_type_id),  # segment ids
    ): fn(samples)

    # Separate the data into batches.
    batches = []
    one_batch = []
    for example in examples:
        one_batch.append(example)
        if len(one_batch) == batch_size:
            batches.append(one_batch)
            one_batch = []
    if one_batch:
        # The last batch, whose size is less than the configured batch_size.
        batches.append(one_batch)

    results = []
    model.eval()
    for batch in tqdm(batches):
        input_ids, segment_ids = batchify_fn(batch)
        input_ids = paddle.to_tensor(input_ids)
        segment_ids = paddle.to_tensor(segment_ids)
        logits = model(input_ids, segment_ids)
        probs = F.softmax(logits, axis=1)
        results.append(probs.numpy())
    return np.vstack(results)

Predict on the test data

def inference():
    model_paths = [
        f'{CFG.model_name}_fold0.bin',
        f'{CFG.model_name}_fold1.bin',
        f'{CFG.model_name}_fold2.bin',
        f'{CFG.model_name}_fold3.bin',
        f'{CFG.model_name}_fold4.bin',
    ]

    # Pick the classification head that matches the chosen pretrained model
    if CFG.model_name == 'ernie-1.0':
        model = ppnlp.transformers.ErnieForSequenceClassification.from_pretrained(
            CFG.model_name, num_classes=CFG.target_size)
    elif CFG.model_name == 'ernie-doc-base-zh':
        model = ppnlp.transformers.ErnieDocForSequenceClassification.from_pretrained(
            'ernie-doc-base-zh', num_classes=CFG.target_size)
    else:
        model = ppnlp.transformers.ErnieGramForSequenceClassification.from_pretrained(
            CFG.model_name, num_classes=CFG.target_size)

    # Average the predicted probabilities of the five fold models (soft voting)
    fold_preds = []
    for model_path in model_paths:
        model.load_dict(P.load(model_path))
        pred = predict(model, test.to_dict(orient='records'), tokenizer, 16)
        fold_preds.append(pred)
    preds = np.mean(fold_preds, axis=0)
    np.save("preds.npy", preds)
    labels = np.argmax(preds, axis=1)
    test['label'] = labels
    test[['id', 'label']].to_csv('paddle.csv', sep='\t', index=None)
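Averaging the fold probabilities before the argmax is soft voting: one very confident model can outweigh two lukewarm ones, which hard majority voting would ignore. A toy illustration:

fold_preds = [np.array([[0.4, 0.6]]),
              np.array([[0.9, 0.1]]),
              np.array([[0.45, 0.55]])]
print(np.mean(fold_preds, axis=0))  # [[0.583..., 0.416...]] -> argmax picks class 0,
# even though two of the three models individually lean towards class 1.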

Training function

def train():
    # ==================================== Cross-validation training ==========================
    for fold in range(CFG.n_fold):
        print(f"===============training fold_nth:{fold + 1}======================")
        trn_idx = folds[folds['fold'] != fold].index
        val_idx = folds[folds['fold'] == fold].index

        train_folds = folds.loc[trn_idx].reset_index(drop=True)
        valid_folds = folds.loc[val_idx].reset_index(drop=True)
        # convert the training split
        train_dataset = CustomDataset(train_folds)
        train_ds = MapDataset(train_dataset)
        # convert the validation split
        dev_dataset = CustomDataset(valid_folds)
        dev_ds = MapDataset(dev_dataset)
        train_data_loader = create_dataloader(
            train_ds,
            mode='train',
            batch_size=CFG.batch_size,
            batchify_fn=batchify_fn,
            trans_fn=trans_func)
        dev_data_loader = create_dataloader(
            dev_ds,
            mode='dev',
            batch_size=CFG.batch_size,
            batchify_fn=batchify_fn,
            trans_fn=trans_func)

        # Pick the model that matches CFG.model_name
        if CFG.model_name == 'ernie-1.0':
            model = ppnlp.transformers.ErnieForSequenceClassification.from_pretrained(
                CFG.model_name, num_classes=CFG.target_size)
        elif CFG.model_name == 'ernie-doc-base-zh':
            model = ppnlp.transformers.ErnieDocForSequenceClassification.from_pretrained(
                'ernie-doc-base-zh', num_classes=CFG.target_size)
        else:
            model = ppnlp.transformers.ErnieGramForSequenceClassification.from_pretrained(
                CFG.model_name, num_classes=CFG.target_size)

        num_training_steps = len(train_data_loader) * CFG.epochs
        lr_scheduler = LinearDecayWithWarmup(CFG.learning_rate, num_training_steps, CFG.warmup_proportion)
        # Build the optimizer; weight decay is skipped for bias and norm parameters
        optimizer = paddle.optimizer.AdamW(
            learning_rate=lr_scheduler,
            parameters=model.parameters(),
            weight_decay=CFG.weight_decay,
            apply_decay_param_fun=lambda x: x in [
                p.name for n, p in model.named_parameters()
                if not any(nd in n for nd in ["bias", "norm"])
            ])
        # cross-entropy loss
        criterion = paddle.nn.loss.CrossEntropyLoss()

        metric = paddle.metric.Accuracy()

        global_step = 0
        best_val_acc = 0
        fgm = FGM(model)
        for epoch in range(1, CFG.epochs + 1):
            for step, batch in enumerate(train_data_loader, start=1):
                input_ids, segment_ids, labels = batch
                logits = model(input_ids, segment_ids)
                loss = criterion(logits, labels)
                probs = F.softmax(logits, axis=1)
                correct = metric.compute(probs, labels)
                metric.update(correct)
                acc = metric.accumulate()
                # f1_score expects (y_true, y_pred) on numpy arrays
                f1 = f1_score(labels.numpy().flatten(), np.argmax(probs.numpy(), axis=1), average='binary')
                global_step += 1
                if global_step % CFG.print_freq == 0:
                    print("global step %d, epoch: %d, batch: %d, loss: %.5f, acc: %.5f,f1_macro: %.5f" % (
                        global_step, epoch, step, float(loss), acc, f1))
                loss.backward()

                # Adversarial training
                fgm.attack()                # add the adversarial perturbation to the embeddings
                logits_adv = model(input_ids, segment_ids)
                loss_adv = criterion(logits_adv, labels)
                loss_adv.backward()         # accumulate the adversarial gradient on top of the normal one
                fgm.restore()               # restore the original embedding parameters

                optimizer.step()
                lr_scheduler.step()
                optimizer.clear_grad()
            acc, f1 = evaluate(model, criterion, metric, dev_data_loader)
            if acc > best_val_acc:
                best_val_acc = acc
                P.save(model.state_dict(), f'{CFG.model_name}_fold{fold}.bin')
            print('Best Val acc %.5f' % best_val_acc)
        del model
        break  # NOTE: only the first fold is trained in this run; remove to train all five folds
train()
===============training fold_nth:1======================


[2022-11-23 19:03:33,381] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/ernie-1.0/ernie_v1_chn_base.pdparams
W1123 19:03:33.386370   628 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1123 19:03:33.390496   628 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.


global step 100, epoch: 1, batch: 100, loss: 0.46791, acc: 0.73188,f1_macro: 0.89655
global step 200, epoch: 1, batch: 200, loss: 0.65628, acc: 0.73625,f1_macro: 0.79245
global step 300, epoch: 1, batch: 300, loss: 0.73637, acc: 0.74031,f1_macro: 0.79245
global step 400, epoch: 1, batch: 400, loss: 0.57431, acc: 0.74234,f1_macro: 0.83636
global step 500, epoch: 1, batch: 500, loss: 0.37923, acc: 0.74138,f1_macro: 0.91525
global step 600, epoch: 1, batch: 600, loss: 0.38467, acc: 0.74344,f1_macro: 0.91228
global step 700, epoch: 1, batch: 700, loss: 0.51419, acc: 0.74562,f1_macro: 0.78261
global step 800, epoch: 1, batch: 800, loss: 0.38103, acc: 0.74750,f1_macro: 0.89286
global step 900, epoch: 1, batch: 900, loss: 0.34946, acc: 0.74899,f1_macro: 0.91304
global step 1000, epoch: 1, batch: 1000, loss: 0.53628, acc: 0.75106,f1_macro: 0.92308
global step 1100, epoch: 1, batch: 1100, loss: 0.42920, acc: 0.75298,f1_macro: 0.91525
eval loss: 0.44171, accu: 0.78530
Best Val acc 0.78530
global step 1200, epoch: 2, batch: 68, loss: 0.22114, acc: 0.83042,f1_macro: 0.91667
global step 1300, epoch: 2, batch: 168, loss: 0.29129, acc: 0.83464,f1_macro: 0.90476
global step 1400, epoch: 2, batch: 268, loss: 0.58879, acc: 0.83605,f1_macro: 0.80000
global step 1500, epoch: 2, batch: 368, loss: 0.42207, acc: 0.83993,f1_macro: 0.85714
global step 1600, epoch: 2, batch: 468, loss: 0.18622, acc: 0.83747,f1_macro: 0.95833
global step 1700, epoch: 2, batch: 568, loss: 0.26564, acc: 0.83814,f1_macro: 0.93617
global step 1800, epoch: 2, batch: 668, loss: 0.38719, acc: 0.84108,f1_macro: 0.86364
global step 1900, epoch: 2, batch: 768, loss: 0.35982, acc: 0.84204,f1_macro: 0.88889
global step 2000, epoch: 2, batch: 868, loss: 0.33469, acc: 0.84465,f1_macro: 0.89474
global step 2100, epoch: 2, batch: 968, loss: 0.52275, acc: 0.84475,f1_macro: 0.81818
global step 2200, epoch: 2, batch: 1068, loss: 0.23787, acc: 0.84574,f1_macro: 0.96552
eval loss: 0.45888, accu: 0.79613
Best Val acc 0.79613
global step 2300, epoch: 3, batch: 36, loss: 0.04629, acc: 0.94878,f1_macro: 0.98246
global step 2400, epoch: 3, batch: 136, loss: 0.19579, acc: 0.94003,f1_macro: 0.97778
global step 2500, epoch: 3, batch: 236, loss: 0.03761, acc: 0.93988,f1_macro: 1.00000
global step 2600, epoch: 3, batch: 336, loss: 0.21055, acc: 0.93899,f1_macro: 0.88235
global step 2700, epoch: 3, batch: 436, loss: 0.07569, acc: 0.94030,f1_macro: 0.97674
global step 2800, epoch: 3, batch: 536, loss: 0.17687, acc: 0.94076,f1_macro: 0.97674
global step 2900, epoch: 3, batch: 636, loss: 0.19501, acc: 0.94099,f1_macro: 0.92000
global step 3000, epoch: 3, batch: 736, loss: 0.25105, acc: 0.94153,f1_macro: 0.93617
global step 3100, epoch: 3, batch: 836, loss: 0.09677, acc: 0.94109,f1_macro: 0.95833
global step 3200, epoch: 3, batch: 936, loss: 0.12877, acc: 0.94071,f1_macro: 0.96000
global step 3300, epoch: 3, batch: 1036, loss: 0.12475, acc: 0.94058,f1_macro: 0.96154
eval loss: 0.50610, accu: 0.82055
Best Val acc 0.82055
global step 3400, epoch: 4, batch: 4, loss: 0.03767, acc: 0.98438,f1_macro: 1.00000
global step 3500, epoch: 4, batch: 104, loss: 0.00521, acc: 0.98257,f1_macro: 1.00000
global step 3600, epoch: 4, batch: 204, loss: 0.01782, acc: 0.98100,f1_macro: 1.00000
global step 3700, epoch: 4, batch: 304, loss: 0.03877, acc: 0.97995,f1_macro: 1.00000
global step 3800, epoch: 4, batch: 404, loss: 0.03308, acc: 0.98028,f1_macro: 1.00000
global step 3900, epoch: 4, batch: 504, loss: 0.02905, acc: 0.98065,f1_macro: 1.00000
global step 4000, epoch: 4, batch: 604, loss: 0.00859, acc: 0.97987,f1_macro: 1.00000
global step 4100, epoch: 4, batch: 704, loss: 0.00564, acc: 0.97954,f1_macro: 1.00000
global step 4200, epoch: 4, batch: 804, loss: 0.02505, acc: 0.97905,f1_macro: 1.00000
global step 4300, epoch: 4, batch: 904, loss: 0.01213, acc: 0.97905,f1_macro: 1.00000
global step 4400, epoch: 4, batch: 1004, loss: 0.30969, acc: 0.97883,f1_macro: 0.92683
global step 4500, epoch: 4, batch: 1104, loss: 0.11206, acc: 0.97905,f1_macro: 0.98039
eval loss: 0.65418, accu: 0.82221
Best Val acc 0.82221
global step 4600, epoch: 5, batch: 72, loss: 0.04641, acc: 0.99045,f1_macro: 0.98182
global step 4700, epoch: 5, batch: 172, loss: 0.00841, acc: 0.99001,f1_macro: 1.00000
global step 4800, epoch: 5, batch: 272, loss: 0.00417, acc: 0.99081,f1_macro: 1.00000
global step 4900, epoch: 5, batch: 372, loss: 0.00382, acc: 0.98975,f1_macro: 1.00000
global step 5000, epoch: 5, batch: 472, loss: 0.00420, acc: 0.99060,f1_macro: 1.00000
global step 5100, epoch: 5, batch: 572, loss: 0.01085, acc: 0.99088,f1_macro: 1.00000
global step 5200, epoch: 5, batch: 672, loss: 0.02734, acc: 0.99056,f1_macro: 0.97778
global step 5300, epoch: 5, batch: 772, loss: 0.00837, acc: 0.99053,f1_macro: 1.00000
global step 5400, epoch: 5, batch: 872, loss: 0.05018, acc: 0.99097,f1_macro: 0.97674
global step 5500, epoch: 5, batch: 972, loss: 0.04738, acc: 0.99126,f1_macro: 0.97959
global step 5600, epoch: 5, batch: 1072, loss: 0.01556, acc: 0.99128,f1_macro: 1.00000
eval loss: 0.79161, accu: 0.82597
Best Val acc 0.82597
# if __name__ == '__main__':
# train()
# inference()

inference()
[2022-11-23 19:47:36,006] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/ernie-1.0/ernie_v1_chn_base.pdparams
100%|██████████| 65/65 [00:02<00:00, 31.80it/s]
100%|██████████| 65/65 [00:01<00:00, 33.80it/s]
100%|██████████| 65/65 [00:01<00:00, 32.83it/s]
100%|██████████| 65/65 [00:01<00:00, 33.76it/s]
100%|██████████| 65/65 [00:01<00:00, 33.67it/s]