[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel

Loïc Minier lool at dooz.org
Sat Oct 10 14:46:23 UTC 2009


So I'm taking a fresh look at this bug from the start.  Sorry I didn't
make more time for NEON earlier in the cycle.  This is kind of a brain
dump of where I stand.

First, back to your initial comment:
"ffmpeg also contains some VFP, armv5te and armv6 optimisations. Based on the v6+VFP baseline for Karmic, I suggest that these can be turned on statically, when the karmic toolchain has stabilised sufficiently."

So what you're saying is that we should configure ffmpeg for armv6 + VFP by default in Ubuntu.  I agree and changed:
nooptflags += --disable-armv5te --disable-armv6 --disable-armv6t2
nooptflags += --disable-armvfp --disable-neon

to:
nooptflags += --enable-armv6 --disable-armv6t2
nooptflags += --enable-armvfp --disable-neon

And disabled the now useless VFP flavour.
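
One way to double-check that the changed flags took effect is to look at the config.h that configure generates. This is only a minimal sketch: it assumes config.h carries defines such as HAVE_ARMV6, HAVE_ARMVFP and HAVE_NEON, and the show_arm_features name and its path argument are made up for illustration.

```shell
# Hypothetical helper: print the ARM feature macros from a generated config.h.
# Assumes configure writes lines such as "#define HAVE_ARMV6 1".
show_arm_features() {
    grep -E '^#define HAVE_(ARMV5TE|ARMV6|ARMV6T2|ARMVFP|NEON) ' "$1"
}
```

Called on the config.h of the default flavour's build directory, this should show the armv6 and armvfp macros set to 1 and the armv6t2 and neon macros set to 0, if the flags are honoured as intended.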

I didn't add conditions for Debian vs Ubuntu; I need to check with
Reinhard how he'd like these to look.

Since NEON implies v7+ and since you said ffmpeg's NEON implementation requires VFP, I've updated the neon flavour (which can be used on Debian and Ubuntu) to use:
        --shlibdir=/usr/lib/neon/vfp --enable-armvfp -mcpu=armv7-a \
        --extra-cflags="-mfpu=neon -mfloat-abi=softfp"

NB: I used -mcpu=armv7-a instead of -mcpu=cortex-a8 because cortex-a8
implies ffmpeg's fast_unaligned flag.  Could you confirm that
implementing NEON implies fast_unaligned?

I looked at all upstream commits with NEON in the commit message (git
log --grep=NEON -i or gitk --grep=NEON -i).
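
For reference, the search above can be wrapped so that the matching commits come out oldest first, which is the order they need to be applied in. A small sketch; the list_neon_commits name and the checkout path argument are hypothetical.

```shell
# Hypothetical helper: list commits whose message mentions NEON
# (case-insensitively), oldest first, as "<hash> <subject>".
# $1 is the path to a git checkout of the upstream tree.
list_neon_commits() {
    git -C "$1" log --grep=NEON -i --reverse --format='%H %s'
}
```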

Then the first missing NEON commit I found was indeed r18332.

I found these interesting revs:

bd97c6665a522e7f64ee1456ed9c39f7cde7234f        18332
        ARM: NEON optimised add_pixels_clamped
2a9b6b26f8aefcbe16ad3dd62c09e7224f55afb4        18333
        ARM: NEON optimized put_signed_pixels_clamped
e496c588b4dda7ef2372dbd562b3e288e3c7c81c        18535
        Add guaranteed alignment for loading dest pixels in avg_pixels16_neon
d1779570309a76649c83f4969583bf8733b942cb        18712
        ARM: NEON put_pixels_clamped
1ac68a162bd6dcc80da9e950368b1fe13d4dadf7        18713
        ARM: Use fewer register in NEON put_pixels _y2 and _xy2
3104a36996462c14e3b84c852e76280676088add        18916
        ARM: NEON VP3 Loop Filter
2a82c2a332b439a3c929e936909c83dfcc147343        18944
        NEON-OBJS should also be cleared for each subdir.
14727897af399dc8480ca09f39f8f93acb8d3029        18972
        ARM: add some PLD in NEON IDCT
8a81301ef5051b8ee571cb3bd0bf4cfbefaf68a3        19216
        ARM: slightly faster NEON H264 horizontal loop filter
03586fda4fb70f488e61294e5200f43650117535        19345
        ARM: NEON VP3 IDCT
eb75e8538b294d7f097a1f53a546c353011d9471        19438
        Require aligned memory for everything that needs it
1dc725bbd6a1cc6a572aff0702753dc3559d657b        19745
        ARM: handle VFP register arguments in ff_vector_fmul_window_neon()
822b4ce43bcb76899277a893e01eda322264402d        19494
        Only compile in NEON optimizations for H.264 when the H.264 decoder is enabled.
71b3800785cd7aaf6f0aed00be58b7ed007f31bb        19637
        ARM: NEON optimised vorbis_inverse_coupling
6768555cd5b51c60b061f3a41fa4ea5536e9b2e2        19806
        ARM: NEON optimised FFT and MDCT
a4f631b0f93557846e258fe3c571cdb25b401cf1        19817
        ARM: faster NEON IMDCT
fe312b587ef084d7d1d97813a6ffed1ff6f70ad8        19819
        ARM: NEON optimised MDCT
a99f03a385a8b026a9b032b0d67c437875ee302f        19940
        ARM: interleave cos/sin tables for improved NEON MDCT
36648139160e5cb3388bbb390656394e23b4eee4        19941
        ARM: merge two loops in ff_mdct_calc_neon
94cd4381dc51d12312f83aa6bb519f17dfe47644        19957
        ARM: NEON optimisations for some dsputil functions
f2f5d248aacafa68a9a1f772844a1eb4f6e720ed        19971
        ARM: NEON optimised scalarproduct_float
84acbb94ee14bbaee5f8a1eccaee1750d00f77a2        20000
        ARM: NEON optimised int32_to_float_fmul_scalar
c14a6a4c6f7b62061bae87d45d8bb667d2e5064c        20029
        ARM: NEON optimised vector_fmul_reverse
ef75b6ccb7503ea8818f15ce403ea332d64963a5        20031
        ARM: NEON optimised vector_clipf
927725a4a687d89881a4a7486f892f7566f7a7cb        20063
        ARM: NEON optimised vector_fmul_add
a5dc149d7439ca72f37364bb7a7829f85c1e4eb8        20151
        ARM: remove unnecessary .fpu neon directives
e7f5e7b5b8de7b0540c7db7f100b91b0a05760a7        20163
        ARM: clean up dsputil initialisation

And added the revs you mentioned in comment #20 to the list (superset of the
list in comment #9):
9e80964529f669a752c25832a6d7c64adb32c819        18600
        Reorganise intreadwrite.h
b2ff39618d600db05f527c641e2379fe85237fee        18601
        ARM asm for AV_RN*()
19760072819c8e8130d1d6f78c7de10d4d1fec4c        18917
        ARM: actually add VP3 loop filter
14cbfa75169225e2dc04a69c4c902ccc68022493        19308
        ARM: enable fast_unaligned when --cpu=armv[67] is specified

Then Mans suggested an additional one in comment #23:
10f371694ae5525b084c2ccddc24c2a3912b6492        19818
        Prepare for optimised forward MDCT implementations

and one more in comment #26:
89a1e0a4a528bc3c6ac12e8582728ca77ea922e4        19846
        ARM: 10l: fix large FFTs

When I then went to apply these, I dropped some of them as they didn't apply:
18944
19438
19494
19940
19941
20163

And had to do a manual merge on 19806.
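
The trial-and-drop step can be made repeatable by dry-running each patch before applying it. This is a sketch under the assumption that every revision has been exported to its own patch file; the check_patches name and the file layout are hypothetical. `git apply --check` only tests whether a patch would apply and changes nothing.

```shell
# Hypothetical helper: report which patch files would apply cleanly to the
# source tree in $1. Patch paths must be absolute, since the check runs
# from inside the tree.
check_patches() {
    tree=$1; shift
    for p in "$@"; do
        if (cd "$tree" && git apply --check "$p") 2>/dev/null; then
            echo "OK   $p"
        else
            echo "FAIL $p"
        fi
    done
}
```

Later revisions may depend on earlier ones, so the result is only meaningful when the check is run against a tree that already has the preceding patches applied.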

After this quite brutal approach, I wasn't too surprised that my build didn't pass; it's currently failing with:
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/mpegvideo_armv5te_s.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/mpegvideo_armv5te_s.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/simple_idct_armv5te.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/simple_idct_armv5te.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/simple_idct_armv6.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/simple_idct_armv6.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/dsputil_vfp.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_vfp.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/float_arm_vfp.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/float_arm_vfp.c
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros          -c -o libavcodec/arm/dsputil_neon.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c: In function 'ff_dsputil_init_neon':
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:290: error: 'DSPContext' has no member named 'vector_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:291: error: 'DSPContext' has no member named 'butterflies_float'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:292: error: 'DSPContext' has no member named 'scalarproduct_float'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:295: error: 'DSPContext' has no member named 'vector_fmul_add'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:297: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:298: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:300: error: 'DSPContext' has no member named 'sv_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:301: error: 'DSPContext' has no member named 'sv_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:303: error: 'DSPContext' has no member named 'vector_clipf'
make[1]: *** [libavcodec/arm/dsputil_neon.o] Error 1
make[1]: Leaving directory `/build/buildd/ffmpeg-0.5+svn20090706/debian-neon'
make: *** [build-stamp-neon] Error 2
rm configure-stamp-neon configure-stamp-static configure-stamp-shared
dpkg-buildpackage: error: debian/rules build gave error exit status 2

I need to research which commit I'm missing (perhaps one I didn't catch in
my list, or perhaps one I skipped).
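
One possible way to track this down is git's pickaxe search, which lists the commits that add or remove a given string. A sketch only: the find_symbol_commits name is hypothetical, and it assumes DSPContext is declared in libavcodec/dsputil.h in the upstream checkout.

```shell
# Hypothetical helper: list commits that add or remove the string $2 in
# file $3 of the checkout at $1, oldest first, as "<hash> <subject>".
find_symbol_commits() {
    git -C "$1" log --reverse --format='%H %s' -S"$2" -- "$3"
}
```

For example, `find_symbol_commits /path/to/ffmpeg vector_fmul_scalar libavcodec/dsputil.h` should point at the revision that introduced that member; the same can be repeated for each member named in the errors above.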

-- 
Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
https://bugs.launchpad.net/bugs/383240