ia64/lib/copy_user.S

8  * the boundary. When reading from user space we must catch
9  * faults on loads. When writing to user space we must catch
11  * we don't need to worry about overlapping regions.
27  *	- handle the case where we have more than 16 bytes and the alignment
39 #define COPY_BREAK	16	// we do byte copy below (must be >=16)
111 	// Now we do the byte by byte loop with software pipeline
128 	// At this point we know we have more than 16 bytes to copy
133 	// The basic idea is that we copy byte-by-byte at the head so
134 	// that we can reach 8-byte alignment for both src1 and dst1.
153 	// Optimization. If dst1 is 8-byte aligned (quite common), we don't need
155 	// We know src1 is not 8-byte aligned in this case.
179 	// For the case p14, we don't need to copy the shifted part to
216 	// we have more than 8 bytes to copy.
230 	// because we need 2 back-to-back val1[] to get tmp.
260 	// specifying the bits to shift, we need to provide 7 cases
298 	// To fix that, we simply copy the tail byte by byte.
328 	// At this point we know we have more than 16 bytes to copy
330 	// which may not be the one we want. So for now we must move
331 	// forward slowly until we reach 16byte alignment: no need to
343 	// Stop bit not required after ld4 because if we fail on ld4
344 	// we have never executed the ld1, therefore st1 is not executed.
352 	// Stop bit not required after ld8 because if we fail on ld8
353 	// we have never executed the ld2, therefore st2 is not executed.
367 (p7)	br.cond.dpnt .dotail		// we have less than 16 bytes left
387 	// No matter where we come from (loop or test) the src1 pointer
388 	// is 16 byte aligned AND we have less than 16 bytes to copy.
416 	// Here we handle the case where the byte by byte copy fails
433 	//	- we don't disrupt the pipeline, i.e. data in transit in
435 	//	  We simply replace the load with a simple mov and keep the
436 	//	  pipeline going. We can't really do this inline because
453 	// when we copy the head. We need to finish the pipeline and copy
455 	// at the top we still need to fill the body and tail.
468 	// Here we handle the head & tail part when we check for alignment.
493 	// Here we suppose Page A exists and Page B does not.
495 	// As we move towards eight byte alignment we may encounter faults.
506 	// worry about st2, st4. There we can use the information encapsulated
521 	// Here we know we have less than 16 bytes AND we are either 8 or 16 byte
525 	// This means that we either:
531 	// This allows us to assume that if we fail on a load we haven't possibly
532 	// executed any of the previous (tail) ones, so we don't need to do
533 	// any stores. For instance, if we fail on ld2, this means we had
534 	// 2 or 3 bytes left to copy and we did not execute the ld8 nor ld4.
536 	// This means that we are in a situation similar to a fault in the
543 	// we know that ret0 can never be zero at this point
544 	// because we failed while trying to do a load, i.e. there is still
563 	// Here we simply restart the loop but instead
564 	// of doing loads we fill the pipeline with zeroes
565 	// We can't simply store r0 because we may have valid
569 	// we MUST use src1/endsrc here and not dst1/enddst because
582 	cmp.ne p6,p0=dst1,enddst	// Do we need to finish the tail ?
593 	cmp.ne p6,p0=dst1,enddst	// Do we need to finish the tail ?
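
The matched comments describe a layered strategy: a plain byte copy for short lengths (COPY_BREAK), a byte-by-byte head that advances until 8-byte alignment is reached, a wide body, and a byte-by-byte tail. The following is a minimal C sketch of that idea only, not the routine in copy_user.S. The names copy_sketch and copy_bytes are hypothetical, fault handling and zero-filling are not modeled, and buffers whose source and destination alignments differ simply fall back to the byte loop here, where the assembly comments refer to a shifted copy instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COPY_BREAK 16	/* threshold named in the listing; byte copy at or below it */

/* Hypothetical helper (not from copy_user.S): plain byte-by-byte copy. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
	while (n--)
		*dst++ = *src++;
}

/*
 * Hypothetical model of the head/body/tail split.
 * Assumes non-overlapping regions, as the listing's header comment does.
 */
void *copy_sketch(void *dst_, const void *src_, size_t len)
{
	unsigned char *dst = dst_;
	const unsigned char *src = src_;
	size_t head;

	/* Short copies, and (in this sketch only) mutually misaligned
	 * buffers, take the simple byte loop. */
	if (len <= COPY_BREAK ||
	    ((uintptr_t)src & 7) != ((uintptr_t)dst & 7)) {
		copy_bytes(dst, src, len);
		return dst_;
	}

	/* Head: copy bytes until both pointers are 8-byte aligned. */
	head = (8 - ((uintptr_t)src & 7)) & 7;
	copy_bytes(dst, src, head);
	dst += head;
	src += head;
	len -= head;

	/* Body: move 8 bytes at a time. */
	while (len >= 8) {
		uint64_t w;

		memcpy(&w, src, sizeof(w));
		memcpy(dst, &w, sizeof(w));
		dst += 8;
		src += 8;
		len -= 8;
	}

	/* Tail: fewer than 8 bytes remain; finish byte by byte. */
	copy_bytes(dst, src, len);
	return dst_;
}
```

Calling copy_sketch(dst, src, n) on two non-overlapping buffers copies n bytes and returns dst. The early fallback keeps the sketch short; it says nothing about how the real routine handles mismatched alignment or user-space faults.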