[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Loïc Minier
lool at dooz.org
Sat Oct 10 14:46:23 UTC 2009
So I'm taking a fresh look at this bug from the start. Sorry I didn't
make more time for NEON earlier in the cycle. This is kind of a brain
dump of where I stand.
First, back to your initial comment:
"ffmpeg also contains some VFP, armv5te and armv6 optimisations. Based on the v6+VFP baseline for Karmic, I suggest that these can be turned on statically, when the karmic toolchain has stabilised sufficiently."
So what you're saying is that we should configure ffmpeg for armv6 + VFP by default in Ubuntu. I agree and changed:
nooptflags += --disable-armv5te --disable-armv6 --disable-armv6t2
nooptflags += --disable-armvfp --disable-neon
to:
nooptflags += --enable-armv6 --disable-armv6t2
nooptflags += --enable-armvfp --disable-neon
I also disabled the now-useless VFP flavour.
I didn't add conditions for Debian vs Ubuntu; I need to check with
Reinhard how he'd like these to look.
Since NEON implies v7+ and since you said ffmpeg's NEON implementation requires VFP, I've updated the neon flavour (which can be used on Debian and Ubuntu) to use:
--shlibdir=/usr/lib/neon/vfp --enable-armvfp -mcpu=armv7-a \
--extra-cflags="-mfpu=neon -mfloat-abi=softfp"
NB: I used -mcpu=armv7-a instead of -mcpu=cortex-a8 because cortex-a8
implies the fast_unaligned ffmpeg flag. Could you confirm that
implementing NEON implies fast_unaligned?
I looked at all upstream commits with NEON in the commit message (git
log --grep=NEON -i or gitk --grep=NEON -i).
Then the first missing NEON commit I found was indeed r18332.
I found these interesting revs:
bd97c6665a522e7f64ee1456ed9c39f7cde7234f 18332
ARM: NEON optimised add_pixels_clamped
2a9b6b26f8aefcbe16ad3dd62c09e7224f55afb4 18333
ARM: NEON optimized put_signed_pixels_clamped
e496c588b4dda7ef2372dbd562b3e288e3c7c81c 18535
Add guaranteed alignment for loading dest pixels in avg_pixels16_neon
d1779570309a76649c83f4969583bf8733b942cb 18712
ARM: NEON put_pixels_clamped
1ac68a162bd6dcc80da9e950368b1fe13d4dadf7 18713
ARM: Use fewer register in NEON put_pixels _y2 and _xy2
3104a36996462c14e3b84c852e76280676088add 18916
ARM: NEON VP3 Loop Filter
2a82c2a332b439a3c929e936909c83dfcc147343 18944
NEON-OBJS should also be cleared for each subdir.
14727897af399dc8480ca09f39f8f93acb8d3029 18972
ARM: add some PLD in NEON IDCT
8a81301ef5051b8ee571cb3bd0bf4cfbefaf68a3 19216
ARM: slightly faster NEON H264 horizontal loop filter
03586fda4fb70f488e61294e5200f43650117535 19345
ARM: NEON VP3 IDCT
eb75e8538b294d7f097a1f53a546c353011d9471 19438
Require aligned memory for everything that needs it
1dc725bbd6a1cc6a572aff0702753dc3559d657b 19745
ARM: handle VFP register arguments in ff_vector_fmul_window_neon()
822b4ce43bcb76899277a893e01eda322264402d 19494
Only compile in NEON optimizations for H.264 when the H.264 decoder is enabled.
71b3800785cd7aaf6f0aed00be58b7ed007f31bb 19637
ARM: NEON optimised vorbis_inverse_coupling
6768555cd5b51c60b061f3a41fa4ea5536e9b2e2 19806
ARM: NEON optimised FFT and MDCT
a4f631b0f93557846e258fe3c571cdb25b401cf1 19817
ARM: faster NEON IMDCT
fe312b587ef084d7d1d97813a6ffed1ff6f70ad8 19819
ARM: NEON optimised MDCT
a99f03a385a8b026a9b032b0d67c437875ee302f 19940
ARM: interleave cos/sin tables for improved NEON MDCT
36648139160e5cb3388bbb390656394e23b4eee4 19941
ARM: merge two loops in ff_mdct_calc_neon
94cd4381dc51d12312f83aa6bb519f17dfe47644 19957
ARM: NEON optimisations for some dsputil functions
f2f5d248aacafa68a9a1f772844a1eb4f6e720ed 19971
ARM: NEON optimised scalarproduct_float
84acbb94ee14bbaee5f8a1eccaee1750d00f77a2 20000
ARM: NEON optimised int32_to_float_fmul_scalar
c14a6a4c6f7b62061bae87d45d8bb667d2e5064c 20029
ARM: NEON optimised vector_fmul_reverse
ef75b6ccb7503ea8818f15ce403ea332d64963a5 20031
ARM: NEON optimised vector_clipf
927725a4a687d89881a4a7486f892f7566f7a7cb 20063
ARM: NEON optimised vector_fmul_add
a5dc149d7439ca72f37364bb7a7829f85c1e4eb8 20151
ARM: remove unnecessary .fpu neon directives
e7f5e7b5b8de7b0540c7db7f100b91b0a05760a7 20163
ARM: clean up dsputil initialisation
I also added the revs you mentioned in comment #20 to the list (a superset of the
list in comment #9):
9e80964529f669a752c25832a6d7c64adb32c819 18600
Reorganise intreadwrite.h
b2ff39618d600db05f527c641e2379fe85237fee 18601
ARM asm for AV_RN*()
19760072819c8e8130d1d6f78c7de10d4d1fec4c 18917
ARM: actually add VP3 loop filter
14cbfa75169225e2dc04a69c4c902ccc68022493 19308
ARM: enable fast_unaligned when --cpu=armv[67] is specified
Then Mans suggested an additional one in comment #23:
10f371694ae5525b084c2ccddc24c2a3912b6492 19818
Prepare for optimised forward MDCT implementations
and one more in comment #26:
89a1e0a4a528bc3c6ac12e8582728ca77ea922e4 19846
ARM: 10l: fix large FFTs
When I then went to apply these, I dropped the following because they didn't apply:
18944
19438
19494
19940
19941
20163
And had to do a manual merge on 19806.
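The list above is not strictly in revision order (19745 appears before 19494, for instance), so one way to prepare the patch series is to sort the "hash revision" pairs by SVN revision and filter out the revisions known not to apply. The following is only a sketch of that bookkeeping; the function name plan_patches and its calling convention are made up for illustration and are not part of the packaging.

```shell
#!/bin/sh
# Hypothetical helper: read "hash revision" pairs on stdin, sort them by
# SVN revision, and print "revision hash" for every revision that is not
# in the skip list given as $1 (space-separated).
plan_patches() {
    skip=" $1 "
    sort -k2,2n | while read -r hash rev; do
        case "$skip" in
            *" $rev "*) ;;                        # listed as not applying: drop it
            *) printf '%s %s\n' "$rev" "$hash" ;;
        esac
    done
}

# Three entries copied from the list above, in their original order.
printf '%s\n' \
    '1dc725bbd6a1cc6a572aff0702753dc3559d657b 19745' \
    '822b4ce43bcb76899277a893e01eda322264402d 19494' \
    '71b3800785cd7aaf6f0aed00be58b7ed007f31bb 19637' |
    plan_patches '19494'
```

With the sample input, 19494 is dropped and the two remaining entries come out as 19637 followed by 19745.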
After this quite brutal approach, I wasn't too surprised that my build didn't pass; it's currently failing with:
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/mpegvideo_armv5te_s.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/mpegvideo_armv5te_s.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/simple_idct_armv5te.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/simple_idct_armv5te.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/simple_idct_armv6.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/simple_idct_armv6.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/dsputil_vfp.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_vfp.S
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/float_arm_vfp.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/float_arm_vfp.c
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/dsputil_neon.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c: In function 'ff_dsputil_init_neon':
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:290: error: 'DSPContext' has no member named 'vector_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:291: error: 'DSPContext' has no member named 'butterflies_float'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:292: error: 'DSPContext' has no member named 'scalarproduct_float'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:295: error: 'DSPContext' has no member named 'vector_fmul_add'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:297: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:298: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:300: error: 'DSPContext' has no member named 'sv_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:301: error: 'DSPContext' has no member named 'sv_fmul_scalar'
/build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c:303: error: 'DSPContext' has no member named 'vector_clipf'
make[1]: *** [libavcodec/arm/dsputil_neon.o] Error 1
make[1]: Leaving directory `/build/buildd/ffmpeg-0.5+svn20090706/debian-neon'
make: *** [build-stamp-neon] Error 2
rm configure-stamp-neon configure-stamp-static configure-stamp-shared
dpkg-buildpackage: error: debian/rules build gave error exit status 2
I need to research which commit I'm missing (perhaps one I didn't catch in my
list, or perhaps one I skipped).
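As a starting point for that search, the distinct DSPContext members that gcc complains about can be pulled out of the build log. This is a minimal sketch; the function name missing_members is hypothetical, and the sample lines are shortened copies of the errors above.

```shell
#!/bin/sh
# Hypothetical helper: read a build log on stdin and print each struct
# member that gcc reports as missing, once, in sorted order.
missing_members() {
    sed -n "s/.*has no member named '\([^']*\)'.*/\1/p" | sort -u
}

# Sample taken from the errors above (paths shortened for the example).
sample_log="dsputil_neon.c:290: error: 'DSPContext' has no member named 'vector_fmul_scalar'
dsputil_neon.c:297: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
dsputil_neon.c:298: error: 'DSPContext' has no member named 'vector_fmul_sv_scalar'
dsputil_neon.c:303: error: 'DSPContext' has no member named 'vector_clipf'"

printf '%s\n' "$sample_log" | missing_members
```

Each resulting name could then be fed to git's pickaxe search in an upstream checkout, for example git log -S'vector_clipf' -- libavcodec/dsputil.h (assuming DSPContext is declared in that header), to find the commit that introduced the member.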
--
Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
https://bugs.launchpad.net/bugs/383240
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
--
ubuntu-bugs mailing list
ubuntu-bugs at lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs