The tablet PC application with screenshots achieved the best median rating, although the comparison was not significant.

Values within the third study vary between 3. They are computed as the mean of several subjective ratings, and a higher value means a higher task load. The first study did not yield any significant results, and the Shapiro-Wilk test was not significant either.
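The computation of the task load as a mean of subjective ratings can be sketched as follows. This is a minimal illustration, not the scoring procedure used in the studies: the subscale names and the example ratings are assumptions made for the example, and the text only states that several subjective ratings are averaged.

```python
from statistics import mean

# Hypothetical subscale names and made-up ratings (not data from the studies).
ratings = {
    "mental_demand": 12,
    "physical_demand": 6,
    "temporal_demand": 9,
    "performance": 7,
    "effort": 11,
    "frustration": 3,
}


def task_load(subscale_ratings):
    """Return the unweighted mean of the ratings; higher means more load."""
    return mean(subscale_ratings.values())


score = task_load(ratings)
print(score)
```

An unweighted mean is used because the text mentions no weighting of the individual ratings.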

The highest median was reported for the head-mounted display with text instructions; the lowest median was measured for the tablet PC with text instructions. The post-hoc test revealed significantly lower values for the paper-based instructions (2P) in comparison to the head-mounted device with audio (2HA) and both spatial awareness applications with text (2ST) and audio (2SA). The results of the third study with respect to task load are not significant. They contain values between 5.

Fifty students from the fields of study human computer systems (18) and media communication (32) were recruited, consisting of 27 females and 23 males with an average age of 21 years.

No participant had ever performed a similar repair task before or was accustomed to the experiment workflow. Although none of the factor comparisons was significant, the applications using the head-mounted display, especially the one with text instructions, had a higher task duration than the paper instructions, whereas the task completion time with the tablet PC applications was shorter. With respect to the error values, both the HMD with audio and the tablet with text resulted in more errors than the paper instructions, whereas the HMD with text and the tablet with audio instructions had lower values.

The tablet-based version with text instructions was rated slightly better regarding usability and task load as well as in the qualitative results, although we could not confirm this with a significant result. Notably, the participants had very different prior knowledge of mechanics.

Fifty technical apprentices were recruited, 4 of whom were female.

All participants were acquainted with similar repair tasks and perform them on a regular basis. They had not been familiar with AR before. The study showed a significant superiority of the paper instructions compared to the AR applications: the participants needed less time in all comparisons. Additionally, the paper instructions caused significantly less task load than all other applications except the HMD with text representation.

The total number of errors is very low and does not show any significant differences. With respect to usability, both variants of the head-mounted display were assessed with significantly lower usability. The qualitative feedback also showed that the apprentices are used to working with paper instructions and that the task is relatively easy for them.

They did not need much guidance and experienced the Augmented Reality applications as less efficient. The average age was [...]. Ninety-two percent of the participants said that they need to solve mechanical tasks on a daily basis and spend approximately 23 h per week on those tasks. The individual results of this study have already been published in Aschenbrenner et al. Regarding task duration, the projection-based SAR (3S) was significantly better than all other conditions.

With regard to usability, the head-mounted display application significantly underperformed the SAR condition as well as the tablet with tracking and the tablet with screenshots. The phone condition also significantly underperformed the latter two tablet conditions. Neither the analysis of the errors nor that of the task load was significant. Summarizing the results of Aschenbrenner et al. A valid question is whether the three studies can be compared at all. Study 1 clearly has a completely different user group, and whereas study 2 used a single-user setting, study 3 was conducted within a collaborative setting.

Thus, any interpretation of the following analysis must take into account that these differences may overlay any differences found between devices and implementations. For this reason, the authors refrained from conducting a device-specific comparison, for example all HMD conditions in comparison to all tablet PC conditions.

We calculated a post-hoc comparison with Bonferroni correction if the ANOVA test of the factor over all three studies was significant. For example, a comparison of the factor task duration between the paper instructions of the second experiment and the phone condition of the third experiment can be found in row 2P and column 3P in Figure [...]. If the corresponding box-plots from section 4.

A table comparing the measured task duration of all conditions.
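The Bonferroni step and the row/column lookup can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' analysis script: the raw p-values are made up, only three of the condition labels are used, and the pairwise tests that would produce the raw p-values are not shown.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Three condition labels taken from the text; the p-values are invented.
conditions = ["2P", "3P", "3S"]
pairs = list(combinations(conditions, 2))
raw_p = dict(zip(pairs, [0.004, 0.30, 0.01]))

adjusted = bonferroni(raw_p)

# Pairs that stay below the 0.05 level after correction.
significant = {pair for pair, p in adjusted.items() if p < 0.05}

# Lookup in the style of "row 2P, column 3P".
print(adjusted[("2P", "3P")])
print(sorted(significant))
```

With many conditions the number of pairwise comparisons grows quickly, so the correction becomes strict. This is one plausible reason why many cells of such a comparison table remain non-significant.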

As expected from Figure 9, Figure 13 confirms that the first experiment (first 5 rows) took significantly more time for the tasks than the other two experiments, except for the head-mounted display conditions in the second experiment, although the student participants using the same HMD applications (1HA and 1HT) still underperformed compared to the apprentices (2HA and 2HT).

As seen in Figure 9, the variance of the task duration is very high for the first experiment. The reason for both findings is probably the lack of domain knowledge of the student participants, which led to the decision to repeat the experiment with domain experts.

The baseline condition for the first and second experiments is the paper instruction (1P and 2P), and in the case of the third experiment the phone condition (3P). The significant results of the second experiment are visible in the overall comparison: the paper condition 2P outperforms all of the other conditions of the second experiment. Furthermore, the 2P task duration is significantly shorter than that of all conditions of the first experiment, including the paper instructions.

Finally, this condition (2P) took significantly less time than the baseline of the third experiment (3P). This means that the task takes longer with phone support without shared visual context than with paper instructions, which could be expected due to the grounding effort. The head-mounted display with text used by the apprentices took significantly longer than all of the conditions of the third experiment except the phone condition (3P). The SAR application (3S) had a significantly shorter task duration than all of the other conditions except 2P.

Errors have been specified by domain experts. As can be seen in the boxplot in Figure 10, the student participants made more errors than the domain experts. Although both experiments with apprentices show a very low error rate, the head-mounted display led to more errors than the other conditions. In the collaborative setting, some errors are avoided anyway, as the expert will correct the worker if he or she is committing a visible error, for example plugging cables into the wrong socket.

Still, proper mounting, so that no cable or plug is loose, cannot be checked remotely. As already mentioned above, the student participants in the first study made a lot of errors. This is confirmed in Figure 14, where all conditions of study 1 resulted in significantly more errors.

A table comparing the measured number of errors of all conditions.

The latter resulted in significantly fewer errors.

First, Figure 11 already showed a clear difference between both user groups, which is now confirmed in Figure [...]. The students in the first experiment had a larger variance in this subjective measure, which may also be due to the lack of domain knowledge. Furthermore, the values in study 1 are clearly lower than in the other two experiments, except for the HMD conditions in study 2. It is interesting that the perceived usability of the head-mounted display in the second study (2HA and 2HT) was rated significantly lower than all of the other conditions in study 2 and also in study 3.

If we regard the plots in Figure 11, the QUESI values of the HMD conditions for the first and second experiments are comparable, although the user groups are different. The highest values were achieved by the paper instruction condition in the second study (2P) and the tablet screenshot condition (3TS) in the collaborative setting.

The control condition of the second experiment (2P) was rated with a higher usability than the phone condition (3P) and the collaborative HMD condition (3H). The screenshot variant of the third experiment (3TS) yields significantly higher, and thus better, values than the phone condition (3P) and the HMD condition (3H).

Again we can clearly see that the participants of the first experiment perceived a higher task load than the other participants, but in general the task load values are widely spread.

The post-hoc comparison of the task load values depicted in Figure 16 still shows some interesting results, although many comparisons were not significant. First, the task load of the paper instructions in the second experiment (2P) is significantly lower than the results of the first experiment.

It is also significantly lower than the other conditions of experiment 2 except the tablet with text (2TT). Additionally, it still outperforms all of the conditions of the third experiment except the screenshot condition (3TS) and the SAR condition (3S). The highest values, and thus the highest perceived task load, were achieved by the head-mounted display condition with text in the first experiment (1HT). Figures 17 to 20 show a comparison of the four main factors between all experiments. Furthermore, for each comparison, an ANOVA test has been computed, which was significant in all cases.
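To illustrate what such an ANOVA test computes, a one-way F statistic can be sketched as follows. The group values are made-up task durations, not data from the experiments. The sketch stops at the F statistic and its degrees of freedom; the p-value would additionally require the F distribution, for example from a statistics library.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented durations for three hypothetical conditions.
durations = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(durations)
print(f_stat, df_b, df_w)
```

A large F value indicates that the differences between the condition means are large relative to the spread within each condition, which is the case in which a post-hoc comparison is then carried out.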