Understanding P4080 startup sequence


4,224 Views
pegasus711_knig
Contributor II

Hello People.

I've been trying to understand how the P4080 starts up and what needs to be taken care of. To that end I am reading the U-Boot source code and looking at the /arch/powerpc/ directory in the kernel (I am looking at 2.6.27 and 2.6.34 here).

Starting with 2.6.27, here is what I am looking at:

  1. Under /arch/powerpc/kernel/ I see head_fsl_booke.S which seems to be the first file that the bootloader dives into. Am I right?
  2. There are some CPU setup functions in the above directory for the 44x series of processors (IBM I believe) as well as the PPC970 but there seems to be none for the e500. So I am guessing the entire CPU setup is happening in the head_fsl_booke.S file. Is it? Where can I get to read the boot startup requirements for freescale devices? Will docs AN3542 and AN3647 suffice or are there other docs that could help? There is a generic app note AN1809 but it seems to be for PPC970 and 'similar devices'. I don't think the P4080(e500) counts as one :smileyhappy:
  3. Then Ive been looking at the u-boot code in parallel and it lies here arch/powerpc/cpu/mpc85xx/start.S
  4. I see that there is a resetvec.S file which branches off unconditionally to _start_e500 which is a section defined in the above file. Now pardon me for being naive, but what exactly does the line .section .bootpg, "ax" tell us? As far as I remember it is naming a certain section as bootpg and that it should be allocatable and executable. Executable I get it, but could folks kindly refresh my memory as why should such a section be allocatable? To allow itself to be re-allocated depending on the memory map of the board?
  5. Now I a assuming the u-boot has already done many an initialization stuff before handling it over to Linux. Yet I see many similar stuff being done in head_fsl_booke.S. I haven't examined the gritty details of both of them yet but would be great if some one sums the difference in the work done by u-boot and the Linux start up code.
  6. If I would like to run legacy applications on this system which deal with a 32byte cache line size, I may have to toggle between DCBZ and DCBZL with the latter used to clear the entire default cache line whereas the former clearing only the first 32 bytes (can be done by setting the DCBZ compat bit in L1CSR0 register). Now under the Linux source tree under  /arch/powerpc/kernel/ I would like to identify the affected areas. On searching the linux cross reference, I see that those files are
    1. /kernel/align.c
    2. /lib/copy_32.S
    3. /kernel/misc_32.S
    4. /include/asm/auxvec.h
    5. /include/asm/page_64.h
    6. /xmon/ppc-opc.c
    7. /kernel/cpu_setup_ppc970.S
  7. From the above list, the last one is irrelevant, but what exactly do I need to do with the others? Replace DCBZ with DCBZL, right? And before that, do I need to set the compatibility bit in L1CSR0 in head_fsl_booke.S itself?

I know my post has been really long but I hope I have been clear and succinct in describing what my problem is.


Hoping for an answer.


Thanks in advance

12 Replies

1,963 Views
scottwood
NXP Employee

1. Yes.

2. Look in arch/powerpc/kernel/cpu_setup_fsl_booke.S.  What specific information are you looking for?  You're correct, AN1809 is not for e500-based chips.

4. I believe allocatable is required for the section to be included in the target image as something that gets loaded to memory.  It's normal for all code/data linker sections, and has nothing to do with the board's memory map or dynamic allocation.

5. Linux tries to not depend too much on what U-Boot has done, as it may change (or be something other than U-Boot).  The power.org ePAPR document describes the interface between the bootloader and the OS (at least for core-related things, not assumptions about I/O state such as memory map initialization, SERDES, etc).

7. dcbzl will work regardless of the setting of DCBZ32.  You do not need to set DCBZ32 first.

Oh, and most people are probably reading this board during normal working hours, so if you post Friday night/Saturday morning and don't see a response by Sunday, that doesn't mean we're ignoring you. :-)

1,963 Views
pegasus711_knig
Contributor II

Hi Scott.

Thanks for the replies. And yes I'm sorry for being, well, just say a little childish :smileyhappy:. Sometimes all you see are deadlines to meet..and one goes into this childlike mode

Nevertheless, as is always the case, the more you see the less you know, the more you find out as you go. Best immortalized by U2 in 'City of Blinding Lights'. On this note, I've got a few more questions coming your way!

  1. I've taken a look at /arch/powerpc/Makefile and I see that, in addition to head_fsl_booke.S, head_CONFIG_WORD_SIZE.S is always present in the list of things to be included at the head of the image. Here's what I mean, taken from the above-mentioned Makefile: head-y  := arch/powerpc/kernel/head_$(CONFIG_WORD_SIZE).o. This would mean that for the P4080, where CONFIG_WORD_SIZE is 32, I'll always have head_32.S linked in. Could you comment on whether I am right here? If yes, which one appears earlier in the chain? And why would both be needed? Perhaps head_fsl_booke.S only handles booke-specific stuff? If so, wouldn't it make more sense for U-Boot to first jump into the generic head_32.S, which would then hand over to the booke-specific code? Just my mind wandering about. Please help me understand.
  2. I would say that understanding the SMP architecture of Linux would be a great asset in really getting the hang of how the P4080 boots, as it has 8 e500mc cores. On browsing the code, I saw quite a few structures with ePAPR in their names. I believe it is important to have a brief overview of this ePAPR thing to understand what's happening. Any pointers as to what I should read to get a 'feel' for it? I don't want to have to read the entire manual. Time is always short for an engineer on the production line. Don't you agree :smileycool:?
  3. If I would like to add printk-like messages to this early startup code, how do I go about it? I see that early printks are already enabled. But I've seen some code using a ppc_md.progress callback to dump info onto the console. Will I be able to add such statements to the early startup code so that it tells me where it is at a given point BEFORE it actually starts the init process? Are there any specific methods here, especially for PowerPC (I'm not well versed in this architecture as of now, frankly speaking)?

And yes, the reason I'm still stuck with 27 is that the higher-ups want it that way, and I bet you know how difficult it is to talk to someone who lost touch with technology quite some time ago. I'm sure you do :smileygrin:. Hence I'm tagging along. And I do not think 27 is that primitive; I've seen people stuck with 2.4 even now :smileywink:

Regards.

0 Kudos

1,963 Views
scottwood
NXP Employee

1. If you look closely at that makefile fragment, you'll see that it's using := instead of the usual +=.  So when you get to head-$(CONFIG_FSL_BOOKE) it replaces head-y rather than adding to it.  head_32.S isn't really generic -- it's specifically for classic PPC chips and wouldn't work on booke.

2. ePAPR can be found here: ePAPR Version 1.1 | Power.org. I assume you're talking about the spin table stuff -- see section 5.5.

3. Make sure CONFIG_EARLY_PRINTK is enabled.  If you still don't see your early output, use JTAG to dump the console buffer (to find the address, do "p &__log_buf" in gdb).

1,963 Views
pegasus711_knig
Contributor II

Hi Scott.

I'm a bit late on this thread; other things got in the way. Nevertheless, I've been reading the ePAPR, which looks more like a document describing guidelines than a description of something existing. More like a POSIX document, perhaps?

I also see many references to the term 'client program'. Does this client program refer to the bootloader, i.e. the first program to run when the chip comes out of reset, or does it refer to the program that will eventually run on the chip, which could be anything from a bare-metal kind of OS to a full-fledged OS like Linux?

Now coming to the actual document: section 5.3 of the ePAPR talks about Initial Mapped Areas (IMAs), which I believe are mapped by default to certain portions of memory. I found what looks like a related concept in the Freescale PowerPC Programming Environments Manual, section 7.3.1.1, called Predefined Physical Memory Locations. Do these sections in the two documents refer to the same concept?

Here are the predefined physical memory locations

[Image: pastedImage_8.png, table of predefined physical memory locations]

The freescale document says:

"Four areas of the physical memory map have predefined uses. The first 256 bytes of physical memory (or if MSR[IP] = 1, the first 256 bytes of memory located at physical address 0xFFF0_0000) are assigned for arbitrary use by the operating system. The rest of that first page of physical memory defined by the vector base address (determined by MSR[IP]) is used for interrupt vectors or reserved for future interrupt vectors."

So it means that when the processor comes out of reset (which can be looked at as an interrupt), it vectors into memory area 2, which can be at either 0x0000_0100 or 0xFFF0_0100. Am I right here? If so, can this memory region be categorized as a Boot IMA as per the ePAPR document?

Now doing a quick scan of 0xFFF0_0000 under the arch/powerpc folder didn't quite turn up anything concrete:

    $ grep -iwr 0xFFF00000 .
    ./boot/dcr.h:#define        EBC_BXCR_BAS                                    0xfff00000
    ./boot/dts/mgsuvd.dts:          ranges = <0 0xfff00000 0x00004000>;
    ./kernel/cputable.c:            .pvr_mask               = 0xfff00000,
    ./kernel/cputable.c:            .pvr_mask               = 0xfff00000,
    ./platforms/powermac/cache.S: * that we can read from ROM at physical address 0xfff00000.)
    ./platforms/powermac/pci.c:     reg = ((region->start >> 16) & 0xfff0) | (region->end & 0xfff00000);
    ./sysdev/fsl_pci.c:                     rsrc_cfg.start = (rsrc_reg.start & 0xfff00000) + 0x8300;
    ./sysdev/fsl_pci.c:                     rsrc_cfg.start = (rsrc_reg.start & 0xfff00000) + 0x8380;

However, as per the U-Boot source code, the reset vector address is 0xFFFF_FFFC. There is a #define within u-boot.lds, the linker script for the mpc85xx, which explicitly defines the reset vector address to be 0xFFFF_FFFC. This address, however, is not mentioned at all in the ePAPR. I've tried looking into the programming references manual, the e500mc reference manual and the programmer's reference manual, but to no avail. How did U-Boot arrive at this address? The only reference I could find for this address is in section 6.6 of the e500mc reference manual, which covers the TLB states after reset. Is there a memory map for the e500 family of processors that I could find somewhere in these manuals?

Secondly, I see in ePAPR in section 5.4 entitled CPU Entry points requirements that the initial register values must be as such:

[Image: pastedImage_12.png, table of initial register values from ePAPR section 5.4]

However, when I see head_fsl_booke.S, I see the following comment:

    /* As with the other PowerPC ports, it is expected that when code
     * execution begins here, the following registers contain valid, yet
     * optional, information:
     *
     *   r3 - Board info structure pointer (DRAM, frequency, MAC address, etc.)
     *   r4 - Starting address of the init RAM disk
     *   r5 - Ending address of the init RAM disk
     *   r6 - Start of kernel command line string (e.g. "mem=128")
     *   r7 - End of kernel command line string
     *
     */

Why is there such a discrepancy? The only thing in common is the content of register r3, since the board info structure pointer and the device tree image seem to suggest the same thing. No?


Lastly, there seem to be quite a few differences between the 'Programming Environments Manual' (MPCFPE32B) and the 'Programmer's Reference Manual' for the e500 family (EREF_RM). The latter is more specific and was published later (2011 vs 2005 for the former). For instance, the Programming Environments Manual defines things such as the segment registers, and some of the MSR bits differ between the two: the former has IR and DR for instruction and data translation, whereas the latter does not seem to have these bits and instead defines IS and DS. I'm rather confused here. Which one should I be referring to, and why? I've noticed that it is rather hard to find the right document for the right answer, unlike with MIPS, but then it could just be a case of a rookie finding his way around. As in the above example, many things change depending on which one I refer to. In fact, the entire address translation mechanism seems to be described differently in these two documents. Your help would be really appreciated at this moment :smileycry:

0 Kudos

1,963 Views
scottwood
NXP Employee

Yes, ePAPR is a specification document, not documentation of particular software.  When it says "client program" that means the program, such as Linux, that is being booted via the ePAPR boot mechanism.  The software that does the loading, such as U-Boot, is the "boot program".  The hardware itself is not an ePAPR loader and thus the document is not relevant to how U-Boot starts.

Ignore the Programming Environments Manual.  That is for classic PPC chips such as mpc7xx or mpc83xx and it is not relevant to QorIQ.

Also ignore the comment in head_fsl_booke.S; that is describing the old way of booting that was used in arch/ppc, without a device tree.

0 Kudos

1,963 Views
pegasus711_knig
Contributor II

Hi Scott.

Thanks for the replies. There were quite a few things I asked in that long post last time, so here I will post a few bulleted, more directed questions and hope I will get what I am looking for. Let me start with head_fsl_booke.S. This is apparently the first file to which control is transferred from U-Boot (or any other bootloader, for that matter). So here are a few questions I have about this file:

  • You said
    ignore the comment in head_fsl_booke.S; that is describing the old way of booting that was used in arch/ppc, without a device tree

  If this is indeed the case, then the way things are interpreted by the _start routine is of no consequence. The code snippet from this file saves GPRs 3 to 7 into GPRs 31 to 27 respectively, so that the lower-numbered registers can be used for function linkage (as part of the stack frame), or so I think (please correct me if I am wrong here). The code snippet is below. As per this, the bootloader is still passing the parameters via registers (or so it seems). Or do we need to refer to a different file under the arch/powerpc/boot directory? A quick glance at that directory and some of its files told me that they are for older U-Boot versions which could NOT parse the device tree themselves; cuboot.c within this directory clearly says so in its banner. Which again means that head_fsl_booke.S is indeed the correct place. I am all confused now :smileyconfused:

    _ENTRY(_start);
            /*
             * Reserve a word at a fixed location to store the address
             * of abatron_pteptrs
             */
            nop
    /*
     * Save parameters we are passed
     */
            mr      r31,r3
            mr      r30,r4
            mr      r29,r5
            mr      r28,r6
            mr      r27,r7
            li      r25,0           /* phys kernel start (low) */
            li      r24,0           /* CPU number */
            li      r23,0           /* phys kernel start (high) */

  • Assuming that head_fsl_booke.S is indeed the file that gets called from the bootloader, I have a few more questions about the way it does things. It is assembly, and that does not help much :smileygrin:. The initial part seems to be a lot of cryptic relocation stuff, and since address translation on this machine is always enabled, that presents an even bigger challenge. More about that later; since it seems fairly involved, I wish to skip it for now and jump to where the kernel code starts. My first question is about the process ID. I see that SPRN_PID0 is loaded with 0 via r6. Later in the file, where it says that we start the 'main' kernel code, the address of the initial task_struct (called init_task, which has been explicitly exported) is loaded into r2. Then the offset of the thread field within task_struct is added to r2, giving the address of the init task's thread_struct, which is put into SPRN_SPRG_THREAD. So, without taking the pain of traversing the code for fork (and similar system calls that create a new process), can I assume that SPRN_SPRG_THREAD will always contain the thread_struct for the current process/thread? Does this hold for both threads and processes, since in Linux a thread is nothing more than a lightweight process?
  • Then we deal with the stack. Here it loads r1 with init_thread_union. But this symbol has not been exported like the init_task above. How can the linker access this symbol then?
  • Anyways, moving on, the above union has two members, a thread_info structure and an unsigned long member which is called the stack. So can I assume that when I say the stack for a given process/thread/task in Linux, it is nothing but the thread_info structure for it?
  • After that it stores a 0 into r0 and does a
    stwu r0,THREAD_SIZE-STACK_FRAME_OVERHEAD(r1)
  • This format is rather quirky and I cannot seem to get my head around it. Here we are storing a 0 (the content of r0) at the effective address given by r1 + (THREAD_SIZE - STACK_FRAME_OVERHEAD), right? Now r1 contains the base address of the init task's thread_info structure, and the expression yields (8192 - 16), which is 8176. So we store a zero at (r1) + 8176.
  • This does NOT make any sense to me. The init_thread_union is a union defined in arch/powerpc/kernel/init_task.c. As mentioned above, this union has a thread_info structure and an unsigned long called the stack. Let's look at what a thread_info structure contains:

    struct thread_info {
            struct task_struct *task;               /* main task structure */
            struct exec_domain *exec_domain;        /* execution domain */
            int             cpu;                    /* cpu we're on */
            int             preempt_count;          /* 0 => preemptable,<0 => BUG */
            struct restart_block restart_block;
            unsigned long   local_flags;            /* private flags for thread */

            /* low level flags - has atomic operations done on it */
            unsigned long   flags ____cacheline_aligned_in_smp;
    };

  • So it contains two pointers, 2 ints and 2 unsigned longs and a structure called restart_block. First I do not see how an offset of 8176 into the thread_info structure for the init task 'yields the stack' for it. Since it is so confusing, let me reproduce the assembly snippet that has got me wondering just what the heck is going on here. And I give you this:

            /* stack */
            lis     r1,init_thread_union@h
            ori     r1,r1,init_thread_union@l
            li      r0,0
            stwu    r0,THREAD_SIZE-STACK_FRAME_OVERHEAD(r1)

  • If the formatting looks yucky, please remember I'm having a hard time getting it right myself :smileygrin:. Sometimes everything just appears garbled. Perhaps the webmaster should take notice. Anyway, I guess I should stop for now. Before proceeding further, I would be glad if someone could shed some light on the above doubts of mine.

Regards.

0 Kudos

1,962 Views
scottwood
NXP Employee

Regarding formatting problems, I've seen issues myself and have reported them.  It's weird -- I see proper formatting in one Firefox profile, but not in another.  Not sure if what you're seeing is the same thing.

The bootloader is still passing things in registers, but not the way head_fsl_booke.S describes.  It is described by ePAPR.  The most important one is r3, which points to the device tree.

Yes, SPRN_SPRG_THREAD (a.k.a. SPRG3) points to the current thread_info struct on 32-bit.  SPRG usage is documented in arch/powerpc/include/asm/reg.h.  Yes, this is true regardless of whether it is a multi-threaded program.  What was the question about process ID?

If by "export" you mean EXPORT_SYMBOL(), that's only relevant for modules.  Inside the kernel image, anything that is not "static" is globally visible.

A thread_union has the stack at the top and the thread_info at the bottom.  I'm not sure what you mean by "when I say the stack for a given process/thread/task in Linux, it is nothing but the thread_info structure".  The "stack" field of the union isn't an unsigned long, it's an *array* of unsigned longs -- 8192 bytes worth (on 32-bit).  Hopefully that answers the later confusion.

0 Kudos

1,962 Views
pegasus711_knig
Contributor II

Hi Scott.

Thanks for the replies, mate. Regarding 1, yep, so basically head-y is head_fsl_booke.S. Regarding 2, I will need to check it out, especially if I want to understand what U-Boot and the early init code are actually doing. Regarding 3, yep, I've done that and I am able to see some of the functions being invoked (in which I've added my own printks).

First, I am sorry for being a little late, as I was on leave, and on vacation one tends to forget work. Secondly, to my surprise, the management is now fine with using 2.6.34, which makes things easier. I see that there are no longer separate ppc and powerpc directories but a single consolidated powerpc directory under arch. Good.

Now, if you remember from the other thread with simbu, I was also interested in making my older code run on newer hardware. Hence I needed to set the DCBZ32 bit in L1CSR0. I am doing that in cpu_setup_fsl_booke.S where the D-cache is being set up, and I've replaced dcbz with dcbzl. I then check in the outer functions whether the bit has been set and, voila, it is. However, now my /sbin/init hangs: the kernel_execve part fails and it is not able to spawn the init process. I've checked my rootfs (which has been properly NFS-mounted as per the console log) and there is indeed an /sbin/init. Searching the net, I've come across people having this issue with Xilinx's ppc405 and ppc440 'soft cores'. Just about anything could be wrong here. Any particular directions I should be looking at, perhaps this weekend?

0 Kudos

1,962 Views
scottwood
NXP Employee

This is getting off-topic for "Understanding P4080 startup sequence"... please put further discussion of DCBZ32 under the existing DCBZ32 thread, or create a new question.  Definitely don't hijack unrelated year-old questions like here. :-)

You need to either tell userspace that it has a 32-byte cache line (vdso_init() is probably what you want to look at), or be selective in which processes you set DCBZ32 for.

1,962 Views
pegasus711_knig
Contributor II

HI Scott..

Thanks for the suggestion. I thought maybe that would crowd the space here, but I guess it'll keep things cleaner. Will do that soon.

0 Kudos

1,962 Views
scottwood
NXP Employee

Note that setting DCBZ32 for all processes, even if you fix the VDSO data, may cause a performance decrease that isn't limited to the old code you're trying to be compatible with.

0 Kudos

1,962 Views
pegasus711_knig
Contributor II

Hello Fellas and Ladies.

Could anyone please at least shed some light on it? :smileyplain:

At least something. You need not be a QorIQ guru, but if you've had your feet wet, you could at least give me a few hints.


0 Kudos