HIS-HMI WS05
After a first introduction, the students ran a first observation experiment (cf. Exercise Nr. 1). Here we discuss the results of this experiment.
Cigaret Machine - Ease of Use
In one experiment a student observed the handling of five different cigaret machines. Every machine M had its own user interface (UI), which could be characterized by some observable properties (P). The subject tried every machine, and after each trial he had to express his 'feeling about the handling' with a number (n) between 1 and 10. The number was interpreted as a 'degree of the Ease of Use': '1' was the label for the lowest degree of Ease of Use and '10' the label for the highest.
With regard to the presupposed theoretical framework of measuring, we can say that the target object (TO) was the feeling of the subject correlated with a sequence of interactions with the user interface of the cigaret machine. The reference object (RO) was a finite set of numerals [1,10], with the interpretation that these numerals represent the beginning of the sequence of integers and can therefore be handled like integers, also with regard to the ordering relation '<'.
Thus it is assumed that the subject of an experiment interacts with the user interface and that, while perceiving this user interface, he has certain feelings which are interpreted as representations of the theoretical term 'Ease of Use'. It is furthermore assumed that these feelings have different 'intensities', each of which the subject can relate to a numeral from a finite set that is ordered with regard to the represented intensities. Thus, if the subject 'feels very bad' during a sequence of interactions, he can label this feeling e.g. with the numeral '3'. If he 'feels very good', he can perhaps label this feeling with the numeral '9'. Empirically the subject will rate the label '9' 'higher' than the label '3', i.e. 3 <e 9. With such a procedure it is possible to rank order all the tested machines with regard to the term 'Ease of Use'.
Figure: Cigaret Machine; Property 'Easy to Use'
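The rank ordering described for the cigaret machines can be illustrated with a minimal Python sketch. The machine names and the ratings are invented for illustration and are not the data of the student's experiment; the function name rank_machines is likewise hypothetical.

```python
# Hypothetical ratings: machine name -> label n in 1..10 given by one subject.
# The values are invented for illustration; they are not the experiment's data.
ratings = {"M1": 3, "M2": 9, "M3": 6, "M4": 3, "M5": 8}


def rank_machines(ratings):
    """Return machine names ordered from lowest to highest Ease of Use label."""
    for machine, n in ratings.items():
        if not 1 <= n <= 10:
            raise ValueError(f"label for {machine} is outside 1..10: {n}")
    # sorted() is stable, so machines with equal labels keep their input order.
    return sorted(ratings, key=lambda machine: ratings[machine])


ranking = rank_machines(ratings)
print(ranking)  # ['M1', 'M4', 'M3', 'M5', 'M2']
```

The sketch uses only the ordering relation '<' on the labels. It does not assume that the distance between '3' and '4' means the same as the distance between '8' and '9'.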
Some questions can be raised here:
How 'stable' is this measurement procedure with regard to the differences between the labels 1 to 10? Would fewer numbers be more accurate?
How stable is this measurement procedure between different subjects? Would the same task with the same machine produce labels which are 'similar' within a deviation of x?
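One possible reading of "similar within a deviation of x" is that the largest and the smallest label given for the same task and machine differ by at most x. The following sketch assumes this reading; the text does not fix a definition, and the labels are invented.

```python
def within_deviation(labels, x):
    """True if all labels differ from one another by at most x."""
    return max(labels) - min(labels) <= x


# Hypothetical labels from four subjects for the same task on the same machine.
labels = [6, 7, 5, 7]
print(within_deviation(labels, 2))  # True: the spread is 7 - 5 = 2
print(within_deviation(labels, 1))  # False
```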
In a next step one can use these measurements to look for 'hints' as to which of the observable properties {P1, ..., Pk} of a user interface are responsible for 'bad labels' and which for 'good labels'. This can be done e.g. by 'isolating' those properties which are 'exclusive' to the machines with the bad labels.
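The 'isolating' step can be sketched with set operations. The sketch rests on two assumptions that are not part of the text: a label of 4 or less counts as 'bad', and a property is 'exclusive' if it occurs on at least one badly labelled machine and on no well labelled one. The properties and labels are invented.

```python
# Hypothetical data: observable properties per machine and one label per machine.
properties = {
    "M1": {"P1", "P2", "P4"},
    "M2": {"P1", "P3"},
    "M3": {"P2", "P3", "P5"},
    "M4": {"P1", "P4", "P5"},
}
labels = {"M1": 3, "M2": 9, "M3": 8, "M4": 2}
BAD_THRESHOLD = 4  # assumption: labels <= 4 count as 'bad'


def exclusive_bad_properties(properties, labels, threshold):
    """Properties seen on some 'bad' machine and on no 'good' machine."""
    bad = [m for m in labels if labels[m] <= threshold]
    good = [m for m in labels if labels[m] > threshold]
    in_bad = set().union(*(properties[m] for m in bad))
    in_good = set().union(*(properties[m] for m in good))
    return in_bad - in_good


suspects = exclusive_bad_properties(properties, labels, BAD_THRESHOLD)
print(sorted(suspects))  # ['P4']
```

Such a result is only a hint: with a handful of machines a property can be exclusive to the badly labelled ones by coincidence.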
In another experiment a student observed four different subjects interacting with one and the same ticket machine. The target object (TO) here was, for each subject, the duration of the interaction between the beginning of a task and its end. The reference object (RO) was a clock with the usual time units. The observation showed that different subjects using the same machine can produce different measured labels. The experimenter assumed that the labels (= numerals) can be interpreted as rational numbers representing time points of a clock, which can be ordered by an ordering relation '<'. Thus a 'lower' numeral a compared to a 'higher' numeral b can be interpreted in the sense that b represents a longer duration than a. The experimenter uses duration to indicate a 'degree of intuitionality': a short duration is interpreted as indicating high intuitionality and a long duration as indicating low intuitionality. The measured labels thus allow an ordering of the subjects according to the durations of their interactions.
Figure: Ticket Machine - Intuitional
This second experiment shows that, for the evaluation of a machine M, it can be necessary to classify the test subjects according to their individual ability to have an intuition for the usage of a machine. If one has 5 different subjects S1, ..., S5 which show totally different values for a certain machine M1, then it would not be possible to evaluate other machines M2, ..., Mk with these different subjects: on account of the differences between the subjects themselves, it would be difficult to interpret the measured differences in relation to the machines under investigation.
This introduces some cases to be tested:
Select through a series of pre-tests groups of subjects (functioning as Equivalence Classes) which show 'similar' results with the same tasks and user interfaces.
Evaluate machines only with regard to pre-tested classes of subjects.
Use the measured differences between machines to get hints as to which kinds of properties are responsible for 'bad labels'.
Try to characterize the pre-tested classes of subjects according to important properties.
Evaluate again machines only with regard to pre-tested classes of subjects, but in this case with classes of subjects which are characterized by tested properties.
Use the measured differences between subjects, which are characterized by properties, to get hints as to which kinds of human properties can be related to bad labels during interactions with a machine.
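The first case, selecting groups of subjects with 'similar' pre-test results, can be sketched as follows. The text does not define 'similar'; this sketch assumes that subjects belong to the same group as long as neighbouring durations in the sorted list differ by no more than a chosen gap. The durations (in seconds), the gap and the function name group_by_gap are invented for illustration.

```python
def group_by_gap(durations, max_gap):
    """Group subjects whose sorted durations lie within max_gap of a neighbour."""
    ordered = sorted(durations.items(), key=lambda item: item[1])
    groups = []
    previous = None
    for subject, seconds in ordered:
        # Start a new group at the first subject or after a large gap.
        if previous is None or seconds - previous > max_gap:
            groups.append([])
        groups[-1].append(subject)
        previous = seconds
    return groups


# Hypothetical pre-test durations for one task on one machine.
durations = {"S1": 42.0, "S2": 95.0, "S3": 47.5, "S4": 101.0, "S5": 180.0}
groups = group_by_gap(durations, max_gap=10.0)
print(groups)  # [['S1', 'S3'], ['S2', 'S4'], ['S5']]
```

The result is a partition of the subjects, so each group can play the role of an equivalence class. Note that with this rule the first and last member of a long chain can lie further apart than the chosen gap.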