Int J Sports Med 2023; 44(05): 352-360
DOI: 10.1055/a-1993-2371
Training & Testing

Prediction of Marathon Performance using Artificial Intelligence

1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
,
Damien Saboul
2   Research and Innovation, Be-ys-research, Argonay, France
,
Michel Clémençon
1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
,
Jérémy Bernard Coquart
1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
3   Unité de Recherche Pluridisciplinaire Sport, Santé, Société Eurasport, 413 avenue Eugène Avinée, 59 120 Loos, France

Abstract

Although studies have used machine learning algorithms to predict performances in sports activities, none, to the best of our knowledge, has used and validated two artificial intelligence techniques, artificial neural network (ANN) and k-nearest neighbor (KNN), in the marathon running discipline and compared the accuracy or precision of the predicted performances. Official French rankings for the 10-km road and marathon events in 2019 were scrutinized, yielding a dataset of 820 athletes (aged over 21 years, having run a 10-km race and a marathon in the same year, with the marathon run at a slower speed than the 10-km, etc.). For the KNN and ANN, the same inputs (10-km race time, body mass index, age and sex) were used to solve a linear regression problem to estimate the marathon race time. No difference was found between the actual and predicted marathon performances for either method (p>0.05). All predicted performances were significantly correlated with the actual ones, with very high correlation coefficients (r>0.90; p<0.001). KNN outperformed ANN, with a mean absolute error of 2.4 vs 5.6%. The study confirms the validity of both algorithms, with better accuracy for KNN in predicting marathon performance. Consequently, the predictions from these artificial intelligence methods may be used in training programs and competitions.


Introduction

The marathon, an athletic endurance event of 42.195 km, was created for the first modern Olympic Games in Athens in 1896. Since the first “urban tour” marathon in New York City in 1976 [1], the marathon has been gaining in popularity and has evolved from an Olympic event to a worldwide social phenomenon [2]. As enthusiasm for this event has increased [1] [2] [3], race times have steadily improved for the best runners (e. g., Top 100 world best performers in the Boston Marathon between 1990 and 2010 in the study by Marc et al., and Top 100, Top 10 and winners from 1897 to 2017 in the study by Knechtle et al.) [3] [4]. From recreational runners of all ages to elite athletes, the objectives may differ widely, from being a finisher, to running the race as fast as possible, to winning it and/or breaking records (e. g., personal, national, world records) to win money (i. e., economic reasons) [5] [6]. Although long-distance performances, as in the marathon, can be influenced by factors beyond the athlete’s control (e. g., climate conditions, seasonal characteristics like temperature, humidity and barometric pressure, etc.) [7] [8], they mainly depend on personal characteristics (age, sex, physical qualities, psychological traits and states, etc.) and training variables (tactics, pacing strategy, etc.) [4] [9] [10] [11] [12] [13]. For example, Weiss et al. [7] showed that temperature and humidity affect pacing in age group marathoners differently (i. e., slowing down for runners of both sexes aged 20–59 with increasing temperature, and slowing down for runners aged under 20 and over 80 with increasing humidity). Other studies [12] [13] indicated that pacing strategy may also depend on the profile of the runners in relation to age (e. g., pace changing is more prominent in younger and older marathoners compared to the other age groups of marathoners) [12] or sex (e. g., men tend to opt more for a “risk” strategy by starting out at fast speeds and then modulating or slowing down afterwards, whereas women tend to err on the side of caution) [13]. Athletes and coaches need to be aware of these parameters and should focus on developing appropriate training programs, with particular emphasis on setting speeds for tempo runs and building competitive or optimal pace strategies to optimize performance [14] [15] [16] [17] [18]. For these reasons, the ability to predict marathon performance can be of great interest in the calibration of training sessions and the definition of the athlete’s potential speed limits in order to achieve the best performance.

The relationships between running distance (or speed) and time have long been used for this purpose [19] [20] [21]. Several studies have sought to predict long-distance running performances for events like the marathon with mathematical models (e. g., logarithmic, hyperbolic, exponential, multiple regression models, etc.) [18] [22] [23] [24], including concepts of critical speed [25] [26] or power laws [25] [27], and machine learning algorithms (i. e., artificial intelligence: AI) [14] [16] [17].

Recently, there has been growing interest in machine learning algorithms, notably with supervised learning, one of the intelligent methodologies that have shown promising results in the prediction of continuous variables in many areas such as weather [28], health [29] and sports [30]. The literature indicates that sport is one of the expanding areas requiring good predictive accuracy [16] [30] [31]. However, although machine learning regression models like artificial neural networks (ANN) [29] [31] [32] [33] or k-nearest neighbors (KNN) [14] have been used to predict performances in some sports activities, the validity and accurate prediction of individual or team performances using AI merits further exploration [34] [35] [36] [37].

ANN is a powerful black-box supervised learning algorithm capable of producing nonlinear input-output mapping [34] [38] [39]. The model consists of one input layer, one or more hidden layers and one output layer. The interconnected components (i. e., neurons) transform a set of inputs into a desired output [38] [39]. The accuracy of this type of model is typically improved by using additional data (i. e., weights associated with interconnected components are continuously changing) during the ANN training process [30] [38]. On the other hand, the KNN model uses one of the simplest types of supervised machine learning algorithms based on learning by analogy, that is, by comparing a given test example with training examples that are similar to it [29] [38]. The basic KNN algorithm has two steps: find the k training examples that are closest (“closeness” is defined in terms of a distance metric, such as the Euclidean distance) to the unseen example and take the average of these k label values [34] [38]. This machine learning model is also noted for not requiring learning (i. e., the computation of the algorithm occurs during runtime) as it memorizes the training dataset [34] [36] [38]. Moreover, it seems that compared to ANN, KNN tends to perform better on datasets with a small number of samples and has less risk of overfitting [40].

Although studies have focused on the use of machine learning algorithms (bagging, local matrix completion, etc.) to predict marathon performance [14] [17] [22] and future slowdowns during the race [16], no study to the best of our knowledge has used and validated ANN and KNN supervised machine learning in this running discipline and compared the accuracy (i. e., the nearness of the actual performance to the predicted performance, and thus a lower mean absolute error or a lower bias, meaning higher accuracy) or precision (i. e., the closeness of the predicted performances, and thus a smaller distance between limits of agreement, meaning higher precision).

The objectives of the current study were therefore to test the validity of two supervised machine learning methods (ANN and KNN) and to compare the accuracy and precision of their marathon performance predictions to determine which one performed best. Based on the literature and our data, we hypothesized that both artificial intelligence techniques would be valid, and that KNN would be the better performing method.


Materials and Methods

Experimental approach

All French official rankings of the French Athletics Federation (FFA for Fédération Française d’Athlétisme) for the 10-km road race (n=217,669) and the marathon (n=92,813), both performed in 2019, were retrospectively analyzed. In France, the marathon is not open to younger categories of athletes, so only athletes over the age of 21 years were selected for both races (n=201,990 on the 10-km and n=92,813 on the marathon). If the athletes had not self-reported their body mass and/or height, they were removed from the analysis. Thus, 7,716 athletes with a 10-km performance, and 4,130 in the marathon were included. Then, only those athletes (women and men) who performed the 10-km and the marathon in the same year (i. e., 2019) were retained. Thus, 1,728 performances were collected. However, as the aim was to predict marathon performance based on 10-km road performance, athletes who ran a marathon before their 10-km were removed (n=833). Moreover, athletes who maintained a higher speed in the marathon than in the 10-km race were also eliminated (n=11). Finally, among the 884 remaining athletes, those with a performance in the 10-km below the lowest ranking of the FFA were eliminated (i. e., performance>50 and 60 min, respectively, for men and women).

This study was approved by the National Ethics Committee for Research in Sports Sciences (CERSTAPS2019220231) [41]. The protocol for this study was legally declared, in accordance with the European General Data Protection Regulations.


Participants

The analysis was thus performed with a dataset of 820 athletes. For each athlete, the sex (i. e., female vs male), date of birth (to calculate age), body mass and height (to calculate the body mass index: BMI), and race times (i. e., the performances on the 10-km and marathon) were recorded.


Procedures & data treatment

Two supervised machine learning regression algorithms were used: ANN and KNN [30] [31] [37] [38] [39]. Both algorithms were implemented in the R language. R software (version R x64 3.6.1, R Development Core Team, Vienna, Austria) was used for our analysis, and the following R packages for machine learning approaches were used: dplyr version 0.8.3 and neuralnet version 1.44.2 [42] [43].

All data were normalized to meet the requirement of the sigmoid transfer function (for ANN) and to remove the scale differences between the input variables (for KNN).
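
The article does not state the normalization formula. The (x − mean)/SD terms in the Table 2 equation suggest z-score standardization, so the following is a minimal Python sketch under that assumption (the study itself used R). The function name `standardize` and the sample times are hypothetical and serve only as an illustration.

```python
import statistics


def standardize(values):
    """Return z-scores, (x - mean) / SD, using the sample standard deviation.

    Assumption: z-score standardization, inferred from the form of the
    Table 2 equation; the article does not name the method explicitly.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


# Hypothetical 10-km times in minutes, for illustration only.
ten_km_times = [35.0, 40.0, 45.0, 50.0]
z_scores = standardize(ten_km_times)
```

After this transformation each input has a mean of 0 and an SD of 1, so no single variable dominates the Euclidean distance in KNN merely because of its units.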

The data from the 820 athletes were randomly split into training and testing datasets. A 90:10 ratio was used, meaning that 90% of the data were randomly selected for the training process (i. e., 738 of the 820 performances), while the remaining 10% were used for the testing process (i. e., 82 performances) [44] [45].
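
As an illustration of the 90:10 split described above, here is a minimal Python sketch. The helper `train_test_split`, its parameters and the seed are hypothetical; the article does not report how the random draw was implemented in R.

```python
import random


def train_test_split(n_samples, test_fraction=0.10, seed=0):
    """Shuffle the sample indices and return (train_indices, test_indices).

    The seed is an arbitrary illustrative choice that makes the split
    reproducible; it is not taken from the article.
    """
    indices = list(range(n_samples))
    rng = random.Random(seed)
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 820 athletes split into 738 training and 82 testing performances.
train_idx, test_idx = train_test_split(820)
```

Reusing the same index lists for both algorithms ensures that ANN and KNN are trained and tested on identical athletes.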

For both supervised machine learning algorithms, the same inputs (10-km race time, BMI, age and sex) were used to solve the linear regression problem, which consisted in estimating the value of the same continuous output (marathon race time). Exactly the same training (n=738) and testing (n=82) data were used for the two algorithms.

In ANN, a multilayer perceptron was used with four inputs and one output ([Fig. 1]) [37]. In this network, the computing units are arranged into three ordered layers. The information flows forward from the four neurons of the input layer to the two connecting neurons of the hidden layer and, finally, to the single neuron of the output layer, with no backward connections. The first layer (the input layer) corresponds to the independent variables (i. e., performance on 10-km, BMI, age and sex), while the third layer (the output layer) corresponds to the dependent variable score (marathon performance). The intermediate layer, which is the hidden layer, consists of all possible connections between the input and output layers and allows for the combined impact of a multiple set of independent variables on the output layer. This ANN makes use of Rprop, which is short for resilient back propagation, a training technique without weight backtracking for supervised learning [43]. The training stopping point (i. e., threshold) was set at 0.01.

Fig. 1 Neural network architecture. Note: BMI: body mass index.

For comparison with the neural network algorithm, KNN was applied using the same four input variables and the same output variable. In this study, the KNN algorithm was tested by selecting the three closest neighbors (k=3). In other words, for each athlete of the testing dataset, we retained the three athletes of the training dataset having the smallest Euclidean distance (the square root of the sum of the squared differences between their four respective inputs). The estimated output (marathon time) was calculated as the average of the marathon times of the three closest neighbors (athletes), weighted by the inverse of their respective Euclidean distances to the testing athlete.
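
The KNN procedure described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' R code: the function name `knn_predict`, the toy data and the handling of an exact match (distance of zero) are assumptions, since the article does not say how such ties were treated.

```python
import math


def knn_predict(train_inputs, train_targets, query, k=3):
    """Inverse-distance-weighted KNN regression.

    train_inputs: list of normalized input tuples (10-km time, BMI, age, sex).
    train_targets: marathon times (min) of the training athletes.
    query: normalized input tuple of the athlete to predict.
    """
    scored = []
    for inputs, target in zip(train_inputs, train_targets):
        # Euclidean distance between the training athlete and the query.
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(inputs, query)))
        scored.append((distance, target))
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]

    # Assumption: an exact match returns its own target, which avoids
    # dividing by a zero distance.
    for distance, target in nearest:
        if distance == 0.0:
            return target

    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * target for w, (_, target) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Toy, already normalized training data (hypothetical values).
toy_inputs = [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0),
              (2.0, 0.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)]
toy_targets = [180.0, 200.0, 220.0, 400.0]
toy_prediction = knn_predict(toy_inputs, toy_targets, (0.5, 0.0, 0.0, 0.0))
```

Because the weights are inverse distances, the two neighbors at distance 0.5 count three times as much as the neighbor at distance 1.5, and the distant fourth athlete is ignored entirely.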


Statistical analysis

Mean values and standard deviation (SD) of variables were calculated.

The Shapiro-Wilk test was used to test whether the data followed a normal (Gaussian) distribution. A Student paired samples t-test was used for normally distributed data to compare the actual and predicted marathon performances for each machine learning algorithm. When these data did not pass the test for normality, a Wilcoxon signed-rank test was used. The magnitude of the differences was assessed by the effect size (ES), which was classified using the Cohen scale [46].

The association between the actual and predicted performances was tested with Pearson’s product-moment or Spearman’s rank order correlations, depending on whether the data followed a normal (Gaussian) distribution. We considered a correlation of r=0.90 or more as very high, between 0.70 and 0.89 as high, between 0.50 and 0.69 as moderate, and between 0.26 and 0.49 as low [47].

The coefficient of determination (r²) and the mean absolute error (MAE) criteria based on a common 90:10 training/test data split were chosen to evaluate the numerical fit of the output from the ANN and KNN models. MAE is the average of the absolute errors, with lower error values typically meaning the model is more accurate and the predictions closely match the actual values [14] [29]. The r² value determines the precision of the predictions and how well the model fits the data [48]. This also makes it easier to compare and evaluate the results [31] [45].

Moreover, the bias (i. e., difference between actual and predicted performances, to assess accuracy) and 95% limits of agreement (95% LoA, i. e., ±1.96 SD of the differences, to assess precision) were computed according to the Bland-Altman method.
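
The quantities defined above can be computed as in the following Python sketch. Two points are assumptions: the sign convention (predicted minus actual) is inferred from the values in Table 1, and the MAE percentage is taken per athlete relative to the actual time, which the article does not spell out. The function name and sample values are hypothetical.

```python
import statistics


def agreement_metrics(actual, predicted):
    """Return Bland-Altman bias, 95% limits of agreement and MAE (%)."""
    # Assumed sign convention: predicted - actual (consistent with Table 1).
    diffs = [p - a for a, p in zip(actual, predicted)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    # Assumed definition: absolute error relative to each actual time.
    mae_pct = statistics.mean([abs(d) / a * 100 for d, a in zip(diffs, actual)])
    return {
        "bias": bias,
        "loa_lower": bias - half_width,
        "loa_upper": bias + half_width,
        "mae_pct": mae_pct,
    }


# Hypothetical marathon times in minutes, for illustration only.
actual_times = [200.0, 180.0, 240.0, 210.0]
predicted_times = [204.0, 178.0, 246.0, 208.0]
metrics = agreement_metrics(actual_times, predicted_times)
```

A bias close to zero indicates high accuracy, while a narrow interval between the two limits of agreement indicates high precision.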

Finally, the KNN and ANN models were compared to determine which one performed better. The predicted marathon performances were compared with the actual marathon performances. A model was considered valid if the MAE was less than 5% and if the bias and limits of agreement were acceptable. The best model was the one with the lowest MAE and the highest accuracy (from the bias, i. e., the closeness of the actual performance to the predicted performance; a lower mean absolute error or bias means a higher accuracy) and precision (from the LoA, i. e., the closeness of the predicted performances to one another; a smaller distance between the LoA means a higher precision) [49].

The level of statistical significance was set at p<0.05, and all analyses were performed with the Statistical Package for the Social Sciences (SPSS, release 20.0, Chicago, IL, USA).



Results

Means and SD of actual and predicted marathon performances are presented in [Table 1].

Table 1 Mean values and standard deviation (SD) of actual and predicted marathon performances from each algorithm (i. e., artificial neural network (ANN) and k-nearest neighbors (KNN)) in 82 athletes (i. e., the 10% of the data selected for the testing process), difference between actual and predicted performances (p), magnitude of the difference (ES), Pearson’s product-moment correlation with actual performance (r), bias and 95% LoA (min and %) and mean absolute error (MAE) (%).

| Marathon performance | Mean performance (min) ± SD | p | ES | Interpretation (ES) | r | Interpretation (r) | Bias (min) ± 95% LoA | Bias (%) ± 95% LoA | MAE (%) |
|---|---|---|---|---|---|---|---|---|---|
| Actual | 199.75 ± 36.01 | | | | | | | | |
| Predicted from the ANN | 202.77 ± 30.24 | 0.063 | −0.091 | Trivial | 0.918* | Very high | 3.023 ± 28.492 | 1.5 ± 14.1 | 5.6 |
| Predicted from the KNN | 198.96 ± 32.74 | 0.333 | −0.024 | Trivial | 0.982* | Very high | −0.79 ± 14.35 | −0.4 ± 7.2 | 2.4 |

Note: Bias=difference between actual and predicted performances; r=coefficient of correlation; LoA=limits of agreement; ES=effect size. *Significantly correlated at p<0.001.

The ANN-equation to estimate marathon performance from performance on 10-km, BMI, age and sex is depicted in [Table 2] and [Fig. 2].

Fig. 2 Neural network architecture with computational details. Note: BMI: body mass index.

Table 2 Syntax (Excel spreadsheet) of the artificial neural network-based equation to estimate marathon performance (min) from performance on 10-km, BMI, age and sex, with the mean values and standard deviation (SD) of the variables used by the artificial neural network (ANN) algorithm.

Marathon Performance=(((1/(1+EXP(-((((C3-40.734675)/5.886275)*(0.1717))+(((D3-21.518491)/2.061137)*(-1.11518))+(((E3-43.432927)/9.513137)*(0.28333))+(((F3-1.8304878)/0.3754327)*(0.95911))+(0.68255)))))*(-1.5208))+((1/(1+EXP(-((((C3-40.734675)/5.886275)*(-0.78176))+(((D3-21.518491)/2.061137)*(0.21688))+(((E3-43.432927)/9.513137)*(-0.05194))+(((F3-1.8304878)/0.3754327)*(-0.30595))+(-0.06117)))))*(-4.98486))+(3.41225))*37.77462+204.8372

| Variable | Mean | SD |
|---|---|---|
| 10-km time (min) | 40.734675 | 5.886275 |
| BMI (kg.m-2) | 21.518491 | 2.061137 |
| Age (years) | 43.432927 | 9.513137 |
| Sex | 1.8304878 | 0.3754327 |
| Marathon time (min) | 204.83720 | 37.77462 |

Note: C3=performance on 10-km (min); D3=BMI (body mass index); E3=age in years; F3=sex (female=1; male=2).
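
For readers who do not use a spreadsheet, the Table 2 equation can be transcribed into Python as sketched below. The weights, means and SDs are copied from Table 2; the function and variable names are hypothetical, and the decomposition into two sigmoid hidden neurons follows the structure of the Excel formula.

```python
import math

# (mean, SD) of each input, copied from Table 2.
INPUT_STATS = [
    (40.734675, 5.886275),   # 10-km time (min)
    (21.518491, 2.061137),   # BMI (kg.m-2)
    (43.432927, 9.513137),   # age (years)
    (1.8304878, 0.3754327),  # sex (female=1; male=2)
]

# Input weights and bias of the two hidden neurons, copied from Table 2.
HIDDEN_1 = ([0.1717, -1.11518, 0.28333, 0.95911], 0.68255)
HIDDEN_2 = ([-0.78176, 0.21688, -0.05194, -0.30595], -0.06117)

# Output neuron: weights for the two hidden neurons, then the bias.
OUTPUT_WEIGHTS = (-1.5208, -4.98486)
OUTPUT_BIAS = 3.41225

# Mean and SD of marathon time (min), used to undo the normalization.
MARATHON_MEAN, MARATHON_SD = 204.8372, 37.77462


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_marathon_minutes(ten_km_min, bmi, age, sex):
    """Evaluate the Table 2 equation for one athlete."""
    raw = (ten_km_min, bmi, age, sex)
    z = [(value - mean) / sd for value, (mean, sd) in zip(raw, INPUT_STATS)]
    hidden = []
    for weights, bias in (HIDDEN_1, HIDDEN_2):
        hidden.append(sigmoid(sum(w * zi for w, zi in zip(weights, z)) + bias))
    normalized = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS
    return normalized * MARATHON_SD + MARATHON_MEAN


# Hypothetical athlete: 40-min 10-km, BMI 21.5, 43 years old, male.
example_prediction = predict_marathon_minutes(40.0, 21.5, 43.0, 2)
```

The returned value is in minutes, like the output of the spreadsheet formula.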

No statistically significant difference was found between the actual and predicted performances for either algorithm (p>0.05, [Table 1]). Moreover, the magnitude of the bias in these predicted performances was systematically trivial (|ES|≤0.091; [Table 1]).

All predicted running performances were correlated with the actual ones, with a very high correlation coefficient (p<0.001, r≥0.918, [Table 1]).

The MAE, presented in [Table 1], were 11 min 16 s (i. e., 5.6%) and 4 min 48 s (i. e., 2.4%) for the ANN and KNN, respectively.

The bias±95% LoA are shown in [Figs. 3] and [4]. The bias±95% LoA of the ANN and KNN were 3 min 14 s±28 min 30 s (i. e., 1.5±14.1%) and −47 s±14 min 21 s (i. e., −0.4±7.2%), respectively ([Table 1] and [Figs. 3] [4]).

Fig. 3 Validity of measurements with the ANN algorithm to predict performance. Top panel: association between actual and predicted performance from the ANN algorithm in the marathon in 82 athletes. The solid line is the linear regression. r² is the coefficient of determination. Bottom panel: Bland and Altman plots for the comparison between actual and predicted performance in the marathon in 82 athletes. Dashed line is the bias, solid lines are the 95% limits of agreement.

Fig. 4 Validity of measurements with the KNN algorithm to predict performance. Top panel: association between actual and predicted performance from the KNN algorithm in the marathon in 82 athletes. The solid line is the linear regression. r² is the coefficient of determination. Bottom panel: Bland and Altman plots for the comparison between actual and predicted performance in the marathon in 82 athletes. Dashed line is the bias, solid lines are the 95% limits of agreement.

Discussion

The objectives of this study were to test the validity of two supervised machine learning algorithms (ANN and KNN) and to compare them to determine which one was better at predicting marathon performances in terms of accuracy (from the MAE and the bias) and precision (from 95% LoA). We hypothesized that both techniques would be valid, and that KNN would be the better performing method. In view of the results obtained with our dataset, this hypothesis appears to be confirmed.

One of the main findings was that the two algorithms can indeed be considered valid, accurate (i. e., a lower bias means a higher accuracy) and precise (i. e., a lower distance between limits of agreement means a higher precision) for predicting marathon performances, as all the results confirmed their validity for predicting performances from independent variables (i. e., performance on 10-km, body mass index, age and sex), with both of them demonstrating a prediction accuracy above 94%. These results were comparable to those of another study [14] that showed 97% accuracy in elite runners with a local matrix completion machine learning technique.

The MAE was lower with KNN than with ANN, meaning that KNN was more accurate (up to 98% instead of 94%) and the predictions matched more closely to actual performances. The fields of application differed (i. e., classification algorithm), and it should be noted that the results were in accordance with Mustafa et al. [50] and Tamilarasi and Porkodi [51], who showed that KNN demonstrated better accuracy than ANN, but in disagreement with those of Peace et al. [38], Musa et al. [52] and Anyama et al. [53] [54], who obtained better predictions with ANN than KNN. Regardless of the domain, there is no real consensus on which algorithm/model is best in terms of regression or classification, and each model has advantages and limitations [29] [32] [55] [56]. Indeed, compared to ANN, KNN tends to perform better on datasets with a small number of samples and has less risk of overfitting. Therefore, even though the data sample size (i. e., 820 athletes) was relatively large in this study, the ANN might have exhibited some limitations (i. e., generalizability, or the risk of overfitting the data), which would explain the greater accuracy of KNN in predicting marathon performances. In other words, the prediction results may be partly explained not only by the breadth of the data (e. g., size of the dataset [40]) but also by the model parameters (e. g., ratio used for training and testing datasets, number of hidden layers or learning rate for training a neural network, k in KNN, distance type in KNN, etc.) and the practitioner’s method (i. e., the modeling procedure [30] [38] [39] [57], which defines the algorithms).

Moreover, the results revealed that the coefficients of correlation were 0.918 and 0.982 for ANN and KNN, respectively, and that the biases±95% LoA of the dependent variable score (i. e., marathon performance) from ANN and KNN were 3 min 14 s±28 min 30 s (i. e., 1.5±14.1%) and −47 s±14 min 21 s (i. e., −0.4±7.2%). Therefore, the coefficient of correlation was slightly higher (i. e., higher prediction precision) for KNN than ANN, and the bias and 95% LoA were lower with KNN (i. e., higher accuracy and precision).

Coquart et al. [15] showed that marathon performance could be predicted from the nomogram of Mercier et al. [58] with acceptable levels of accuracy and precision. Indeed, the authors found a bias and 95% LoA of −1 min 25 s±27 min 3 s (i. e., −0.7±13.2%) [15], which is comparable to ANN in the current study. However, for the same level of accuracy and precision, Coquart et al. [15] had the athletes perform two long-distance maximal performances (i. e., 10-km and 20-km), while the predictions from ANN (and KNN, which is more accurate and precise) were obtained from only one performance (i. e., 10-km). More recently, Vickers and Vertosick [18] explored several methods (i. e., the Riegel formula [59] and two models based on one or two prior races, respectively) for predicting the marathon race times of recreational runners. Between the predicted and observed marathon times, they found a mean square error (MSE) of 6 min 21 s for the Riegel formula [59], 3 min 48 s for the model based on one prior race, and 3 min 28 s for the model based on two prior races. It thus seems that in addition to the number of prior performances a model includes to predict performance, the association of certain variables (e. g., sex, age, BMI) with race velocity also improves the accuracy of predictions. Moreover, it is especially interesting to note that in the current study, which integrated similar factors in the algorithms (i. e., sex, age, BMI), the MSEs were lower, with an MSE of 2 min 39 s for ANN and an MSE of 39 s for KNN, by taking only one performance (i. e., 10-km time). Therefore, it might be interesting to add another performance (e. g., on the half-marathon) to ANN and KNN to see whether prediction accuracy/precision is significantly better than prediction methods not relying on AI. To limit athlete fatigue, ANN and especially KNN should nevertheless be preferred over other prediction methods that require two performances.

The main potential limitations of this study were the sample size and the content of our dataset. As mentioned above, KNN has the advantage of performing better on small data samples (less risk of overfitting) than the ANN algorithm. Moreover, the choice and inclusion of input variables (limited to only four variables, i. e., 10-km performance, BMI, age and sex) determined the algorithms and influenced the prediction results. In other words, the possibility of adjusting the algorithm to better model the problem domain will always be conceivable. Indeed, marathon performance can be affected by a multitude of factors, since the run involves several performance elements: physiological (e. g., maximal oxygen uptake, running economy, anaerobic threshold, etc.) [22], psychological (e. g., motivation, stress) and environmental variables can all be determinant in long-distance running performance [4] [9] [10]. Another limitation of this study concerned the quality of the recovered data. Height and body mass were not measured, with the athletes only reporting their anthropometric data (i. e., self-declaration of body height and mass to calculate the athletes’ BMI). However, runners are known to self-report accurately for this type of data [60]. Moreover, it is likely that the performance data (i. e., 10-km time) were influenced by the race profile (e. g., course profile: uphill, downhill, etc.), meteorological conditions (e. g., temperature, wind, rain, etc.), opposition field or race strategy (e. g., run to win or run for a time), but this information was not available.

Practical applications

The applications from the current study would be further extended by performing validation studies in race (i. e., middle- or long-distance running on track or road), with other inputs that influence running performance (e. g., maximal oxygen uptake, running economy and anaerobic threshold) in specific levels of runners (i. e., subregional, regional, inter-regional, national and international). It would also be interesting to use machine learning techniques to identify the determinant factors of the marathon (i. e., the use of classification techniques like random forest or naive Bayes) in order to help athletes, staff and professional sport analysts to design training programs and detect athletic talent, in addition to predicting results (marathon time).



Conclusions

Few studies on the prediction of running performance, especially for the marathon, have used artificial intelligence with ANN and/or KNN algorithms. The results of the current study demonstrated that both KNN and ANN were able to predict the performances of marathon runners with an acceptable level of accuracy. Both models were valid and able to attain a prediction accuracy above 94%, although KNN appears to be superior to ANN as it accurately predicted marathon performance above 98%. These approaches can therefore be used to predict performances over the course of appropriate training programs, with particular emphasis on prescribing speeds for tempo runs and determining competitive strategies. Future studies should be directed toward the use of machine learning techniques to gain insight into other parameters that impact marathon performance by means of classification techniques in order to detect talented athletes, for example, and not only to predict marathon performance.



Conflict of Interest

The authors declare that they have no conflict of interest.

Acknowledgements

The authors are grateful to the French Athletics Federation (Fédération Française d’Athlétisme) for data collection, and the Orthodynamica Center at Mathilde Hospital 2.


Correspondence

Miss Lucie Lerebourg
Université de Rouen
STAPS, CETAPS
Boulevard Siegfried
76821 Mont-Saint-Aignan
France   
Phone: 02 32 10 77 93   
Fax: 02 32 10 77 93   

Publication History

Received: 07 July 2022

Accepted: 05 December 2022

Accepted Manuscript online:
06 December 2022

Article published online:
17 February 2023

© 2023. Thieme. All rights reserved.

Georg Thieme Verlag
Rüdigerstraße 14, 70469 Stuttgart, Germany

