<div dir="ltr">> Speaking of which, I think it's important to curate a dataset of<br>> success/failure logs with the expected anomalies to be found. Those will<br>> be super useful to prevent regression when trying out new settings or models.<br>> How to store and manage the dataset remains to be defined too.<br>> To give you an idea, fwiw, you can find my original dataset here:<br>>  git clone <a href="https://softwarefactory-project.io/r/logreduce-tests">https://softwarefactory-project.io/r/logreduce-tests</a><br>><br>How did you collect and curate the original dataset? <div>And what do you expect the new set to look like?</div><div><br></div><div>Cheers,</div><div>Klérisson</div></div>