untrack venv

Greg DiCristofaro 2021-01-25 14:44:55 -05:00
parent 4df73f0532
commit 9365749d61
1068 changed files with 0 additions and 267746 deletions


@@ -1,46 +0,0 @@
GitPython was originally written by Michael Trier.
GitPython 0.2 was partially (re)written by Sebastian Thiel, based on 0.1.6 and git-dulwich.
Contributors are:
-Michael Trier <mtrier _at_ gmail.com>
-Alan Briolat
-Florian Apolloner <florian _at_ apolloner.eu>
-David Aguilar <davvid _at_ gmail.com>
-Jelmer Vernooij <jelmer _at_ samba.org>
-Steve Frécinaux <code _at_ istique.net>
-Kai Lautaportti <kai _at_ lautaportti.fi>
-Paul Sowden <paul _at_ idontsmoke.co.uk>
-Sebastian Thiel <byronimo _at_ gmail.com>
-Jonathan Chu <jonathan.chu _at_ me.com>
-Vincent Driessen <me _at_ nvie.com>
-Phil Elson <pelson _dot_ pub _at_ gmail.com>
-Bernard `Guyzmo` Pratz <guyzmo+gitpython+pub@m0g.net>
-Timothy B. Hartman <tbhartman _at_ gmail.com>
-Konstantin Popov <konstantin.popov.89 _at_ yandex.ru>
-Peter Jones <pjones _at_ redhat.com>
-Anson Mansfield <anson.mansfield _at_ gmail.com>
-Ken Odegard <ken.odegard _at_ gmail.com>
-Alexis Horgix Chotard
-Piotr Babij <piotr.babij _at_ gmail.com>
-Mikuláš Poul <mikulaspoul _at_ gmail.com>
-Charles Bouchard-Légaré <cblegare.atl _at_ ntis.ca>
-Yaroslav Halchenko <debian _at_ onerussian.com>
-Tim Swast <swast _at_ google.com>
-William Luc Ritchie
-David Host <hostdm _at_ outlook.com>
-A. Jesse Jiryu Davis <jesse _at_ emptysquare.net>
-Steven Whitman <ninloot _at_ gmail.com>
-Stefan Stancu <stefan.stancu _at_ gmail.com>
-César Izurieta <cesar _at_ caih.org>
-Arthur Milchior <arthur _at_ milchior.fr>
-Anil Khatri <anil.soccer.khatri _at_ gmail.com>
-JJ Graham <thetwoj _at_ gmail.com>
-Ben Thayer <ben _at_ benthayer.com>
-Dries Kennes <admin _at_ dries007.net>
-Pratik Anurag <panurag247365 _at_ gmail.com>
-Harmon <harmon.public _at_ gmail.com>
-Liam Beguin <liambeguin _at_ gmail.com>
-Ram Rachum <ram _at_ rachum.com>
-Alba Mendez <me _at_ alba.sh>
Portions derived from other open source works and are clearly marked.


@@ -1,30 +0,0 @@
Copyright (C) 2008, 2009 Michael Trier and contributors
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the GitPython project nor the names of
its contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@@ -1,30 +0,0 @@
Metadata-Version: 2.1
Name: GitPython
Version: 3.1.12
Summary: Python Git Library
Home-page: https://github.com/gitpython-developers/GitPython
Author: Sebastian Thiel, Michael Trier
Author-email: byronimo@gmail.com, mtrier@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Operating System :: POSIX
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.4
Requires-Dist: gitdb (<5,>=4.0.1)
GitPython is a python library used to interact with Git repositories
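A minimal usage sketch of the library summarized above (not part of the deleted metadata; it assumes an existing repository, with at least one commit, checked out in the current working directory):

from git import Repo

repo = Repo(".")                               # open the repository in the current directory
print(repo.active_branch.name)                 # name of the checked-out branch
print(repo.head.commit.hexsha)                 # SHA of the most recent commit
for commit in repo.iter_commits(max_count=3):  # walk the three most recent commits
    print(commit.summary)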


@@ -1,80 +0,0 @@
GitPython-3.1.12.dist-info/AUTHORS,sha256=VCeono6UyWEV2VF2yhiW-B_jDuWxOK0Id49rmd3ZEgY,1875
GitPython-3.1.12.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
GitPython-3.1.12.dist-info/LICENSE,sha256=_WV__CzvY9JceMq3gI1BTdA6KC5jiTSR_RHDL5i-Z_s,1521
GitPython-3.1.12.dist-info/METADATA,sha256=1URCoo1EU8BQGPvb93zDOHEXy3MBD0VStgdLp78V_AM,1111
GitPython-3.1.12.dist-info/RECORD,,
GitPython-3.1.12.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
GitPython-3.1.12.dist-info/WHEEL,sha256=U88EhGIw8Sj2_phqajeu_EAi3RAo8-C6zV3REsWbWbs,92
GitPython-3.1.12.dist-info/top_level.txt,sha256=0hzDuIp8obv624V3GmbqsagBWkk8ohtGU-Bc1PmTT0o,4
git/__init__.py,sha256=G1n9ROBabuqd5j2YpmnXmmsHq7gDsODk8lHkBe4lRBg,2401
git/__pycache__/__init__.cpython-39.pyc,,
git/__pycache__/cmd.cpython-39.pyc,,
git/__pycache__/compat.cpython-39.pyc,,
git/__pycache__/config.cpython-39.pyc,,
git/__pycache__/db.cpython-39.pyc,,
git/__pycache__/diff.cpython-39.pyc,,
git/__pycache__/exc.cpython-39.pyc,,
git/__pycache__/remote.cpython-39.pyc,,
git/__pycache__/util.cpython-39.pyc,,
git/cmd.py,sha256=HLQoUUZFUrXjs0XcbrbHN1Tv7ZznMzCKQnH0BoT8g2M,42378
git/compat.py,sha256=Iza-8GxsW4iF2TRvud11AtVtZXP7AY9wtVAB45guQc8,1928
git/config.py,sha256=8ZVbE6SfZRx90B_CXjn4xGNk3H5wp9Ezp-0eSJb88C8,30292
git/db.py,sha256=NwCZmj4yXlFF5KMxV0hJkE8TOhoR9Kqq49buDpZrhE8,1975
git/diff.py,sha256=xXa4-aGj28obrvRKwMM2_23i_yv3zeSY9NSI98smoxk,20020
git/exc.py,sha256=yUMJLUdOph6hhQrmcra83KUlvttJmtNvgbRiMwgvSZ0,4841
git/index/__init__.py,sha256=Wj5zgJggZkEXueEDXdijxXahzxhextC08k70n0lHRN0,129
git/index/__pycache__/__init__.cpython-39.pyc,,
git/index/__pycache__/base.cpython-39.pyc,,
git/index/__pycache__/fun.cpython-39.pyc,,
git/index/__pycache__/typ.cpython-39.pyc,,
git/index/__pycache__/util.cpython-39.pyc,,
git/index/base.py,sha256=OavMKoK7MULdqx6mKzCIslMWrzT-s5zRw0Degdn_uuA,52258
git/index/fun.py,sha256=tATpu2q8NKV5l4Yo_eHXfr4N63T0-DU9N-1aEtlyoC0,14226
git/index/typ.py,sha256=GLBZbDS3yScHJs0U18CX-heLTDjjGu6fN7T2L_NQr4A,4976
git/index/util.py,sha256=l6oh9_1KU1v5GQdpxqCOqs6WLt5xN1uWvkVHQqcCToA,2902
git/objects/__init__.py,sha256=6C02LlMygiFwTYtncz3GxEQfzHZr2WvUId0fnJ8HfLo,683
git/objects/__pycache__/__init__.cpython-39.pyc,,
git/objects/__pycache__/base.cpython-39.pyc,,
git/objects/__pycache__/blob.cpython-39.pyc,,
git/objects/__pycache__/commit.cpython-39.pyc,,
git/objects/__pycache__/fun.cpython-39.pyc,,
git/objects/__pycache__/tag.cpython-39.pyc,,
git/objects/__pycache__/tree.cpython-39.pyc,,
git/objects/__pycache__/util.cpython-39.pyc,,
git/objects/base.py,sha256=UZiyzyzx4_OJ3bWnwqb3mqh0LXT7oo0biYaTm-sLuAw,6689
git/objects/blob.py,sha256=evI3ptPmlln6gLpoQRvbIKjK4v59nT8ipd1vk1dGYtc,927
git/objects/commit.py,sha256=Rm--z-JlMXm9caZTD6MDrR-_NpwsxbPoS0XTgp9UNxU,20722
git/objects/fun.py,sha256=XtKasOI5XI_nqtPoRihqF3hPe1BxvRhqqtanVlxVBzg,7257
git/objects/submodule/__init__.py,sha256=OsMeiex7cG6ev2f35IaJ5csH-eXchSoNKCt4HXUG5Ws,93
git/objects/submodule/__pycache__/__init__.cpython-39.pyc,,
git/objects/submodule/__pycache__/base.cpython-39.pyc,,
git/objects/submodule/__pycache__/root.cpython-39.pyc,,
git/objects/submodule/__pycache__/util.cpython-39.pyc,,
git/objects/submodule/base.py,sha256=FVIqJs3fBR_twCS4e7fsxJ0UgPLBART_3xaB45fEunc,55178
git/objects/submodule/root.py,sha256=yT5SMk3uoB2gWijiZyIcFchBzVausNeyRI9O9ty4FsM,17659
git/objects/submodule/util.py,sha256=VdgIG-cBo47b_7JcolAvjWaIMU0X5oImLjJ4wluc_iw,2745
git/objects/tag.py,sha256=-pnz8JJrwYAN7Ol6unalU-IfTBNLHyPvX0JM4Mbga_w,3138
git/objects/tree.py,sha256=fnv7wRM5vpWBpgbitHHKmmTkE1AbHADsz7FKb9cRs_M,10996
git/objects/util.py,sha256=dWbmAYXm9N5w0K9HbekduxLCwUQetNfBrsBXfORJBbQ,12778
git/refs/__init__.py,sha256=3CRfAyE-Z78rJ3kSdKR1PNiXHEjHLw2VkU2JyDviNDU,242
git/refs/__pycache__/__init__.cpython-39.pyc,,
git/refs/__pycache__/head.cpython-39.pyc,,
git/refs/__pycache__/log.cpython-39.pyc,,
git/refs/__pycache__/reference.cpython-39.pyc,,
git/refs/__pycache__/remote.cpython-39.pyc,,
git/refs/__pycache__/symbolic.cpython-39.pyc,,
git/refs/__pycache__/tag.cpython-39.pyc,,
git/refs/head.py,sha256=KY_-Hgm3JDJParX380zxQv5-slxtTNnUE8xs--8nt9U,8706
git/refs/log.py,sha256=DTBUIdOBPD9-aRKrNs4LA8fqpRVJq3CSwfAx14ZIGKc,10625
git/refs/reference.py,sha256=OcQMwHJuelR1yKe1EF0IBfxeQZYv2kf0xunNSVwZV-M,4408
git/refs/remote.py,sha256=6JOyIurnomM3tNXdKRXfMK_V75gJNgr9_2sdevKU_tI,1670
git/refs/symbolic.py,sha256=kiVM2j2KqdGcSwGZ4MgekJ42mi1I-csEicGKz3ZDYzc,26966
git/refs/tag.py,sha256=qoHwJ9suHx8u8NNg-6GvNftK36RnCNkpElRjh2r9wcI,2964
git/remote.py,sha256=giPVw-V7G0LZ9WPkptW6muMc0cvNVcjC9dcmrw-PIfo,36047
git/repo/__init__.py,sha256=ssUH4IVCoua5shI5h1l46P0X1kp82ydxVcH3PIVCnzg,108
git/repo/__pycache__/__init__.cpython-39.pyc,,
git/repo/__pycache__/base.cpython-39.pyc,,
git/repo/__pycache__/fun.cpython-39.pyc,,
git/repo/base.py,sha256=idlwTfkcZkG_xiNNIeI1BVSXt0eyR9skFnKdvqzvv1U,44978
git/repo/fun.py,sha256=xDA5rKLYqCxS1Xn7nBMyP-M9IgEYv0MzYQ8nHthhtTk,11492
git/util.py,sha256=YjAU2Bi-0XiNkRvoRZwSDoJHlLs3AJmQJ7yqH0nDSiU,31583


@@ -1,5 +0,0 @@
Wheel-Version: 1.0
Generator: bdist_wheel (0.33.1)
Root-Is-Purelib: true
Tag: py3-none-any


@@ -1,26 +0,0 @@
Copyright (c) 2013-2020, John McNamara <jmcnamara@cpan.org>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The views and conclusions contained in the software and documentation are those
of the authors and should not be interpreted as representing official policies,
either expressed or implied, of the FreeBSD Project.


@@ -1,90 +0,0 @@
Metadata-Version: 2.1
Name: XlsxWriter
Version: 1.3.7
Summary: A Python module for creating Excel XLSX files.
Home-page: https://github.com/jmcnamara/XlsxWriter
Author: John McNamara
Author-email: jmcnamara@cpan.org
License: BSD
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
XlsxWriter
==========
**XlsxWriter** is a Python module for writing files in the Excel 2007+ XLSX
file format.
XlsxWriter can be used to write text, numbers, formulas and hyperlinks to
multiple worksheets and it supports features such as formatting and many more,
including:
* 100% compatible Excel XLSX files.
* Full formatting.
* Merged cells.
* Defined names.
* Charts.
* Autofilters.
* Data validation and drop down lists.
* Conditional formatting.
* Worksheet PNG/JPEG/BMP/WMF/EMF images.
* Rich multi-format strings.
* Cell comments.
* Integration with Pandas.
* Textboxes.
* Support for adding Macros.
* Memory optimization mode for writing large files.
It supports Python 2.7, 3.4+ and PyPy and uses standard libraries only.
Here is a simple example:
.. code-block:: python
import xlsxwriter
# Create a new Excel file and add a worksheet.
workbook = xlsxwriter.Workbook('demo.xlsx')
worksheet = workbook.add_worksheet()
# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)
# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})
# Write some simple text.
worksheet.write('A1', 'Hello')
# Text with formatting.
worksheet.write('A2', 'World', bold)
# Write some numbers, with row/column notation.
worksheet.write(2, 0, 123)
worksheet.write(3, 0, 123.456)
# Insert an image.
worksheet.insert_image('B5', 'logo.png')
workbook.close()
.. image:: https://raw.github.com/jmcnamara/XlsxWriter/master/dev/docs/source/_images/demo.png
See the full documentation at: https://xlsxwriter.readthedocs.io
Release notes: https://xlsxwriter.readthedocs.io/changes.html


@@ -1,75 +0,0 @@
../../Scripts/__pycache__/vba_extract.cpython-39.pyc,,
../../Scripts/vba_extract.py,sha256=ybSlVCrPQNj9jca6_STPfn_lrnYssbWkRWbPPJ-Fo70,1803
XlsxWriter-1.3.7.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
XlsxWriter-1.3.7.dist-info/LICENSE.txt,sha256=QfBUROEyEBI37cb1D4fVGRoPeCoGazRUtrmlZRp5dAQ,1539
XlsxWriter-1.3.7.dist-info/METADATA,sha256=pU82WHDIY_hZ5EcsjSYb8kaWaAq5NblvyJeDYZnm7oc,2545
XlsxWriter-1.3.7.dist-info/RECORD,,
XlsxWriter-1.3.7.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
XlsxWriter-1.3.7.dist-info/WHEEL,sha256=8zNYZbwQSXoB9IfXOjPfeNwvAsALAjffgk27FqvCWbo,110
XlsxWriter-1.3.7.dist-info/top_level.txt,sha256=xFEUcpcY7QzAr_-ztkXebEt1FZCxpuCbmyCsBaevhRk,11
xlsxwriter/__init__.py,sha256=-UxiTpQgrE7TE53RRazPVADQawLwWfBnnMsGENfheRI,79
xlsxwriter/__pycache__/__init__.cpython-39.pyc,,
xlsxwriter/__pycache__/app.cpython-39.pyc,,
xlsxwriter/__pycache__/chart.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_area.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_bar.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_column.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_doughnut.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_line.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_pie.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_radar.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_scatter.cpython-39.pyc,,
xlsxwriter/__pycache__/chart_stock.cpython-39.pyc,,
xlsxwriter/__pycache__/chartsheet.cpython-39.pyc,,
xlsxwriter/__pycache__/comments.cpython-39.pyc,,
xlsxwriter/__pycache__/compatibility.cpython-39.pyc,,
xlsxwriter/__pycache__/contenttypes.cpython-39.pyc,,
xlsxwriter/__pycache__/core.cpython-39.pyc,,
xlsxwriter/__pycache__/custom.cpython-39.pyc,,
xlsxwriter/__pycache__/drawing.cpython-39.pyc,,
xlsxwriter/__pycache__/exceptions.cpython-39.pyc,,
xlsxwriter/__pycache__/format.cpython-39.pyc,,
xlsxwriter/__pycache__/packager.cpython-39.pyc,,
xlsxwriter/__pycache__/relationships.cpython-39.pyc,,
xlsxwriter/__pycache__/shape.cpython-39.pyc,,
xlsxwriter/__pycache__/sharedstrings.cpython-39.pyc,,
xlsxwriter/__pycache__/styles.cpython-39.pyc,,
xlsxwriter/__pycache__/table.cpython-39.pyc,,
xlsxwriter/__pycache__/theme.cpython-39.pyc,,
xlsxwriter/__pycache__/utility.cpython-39.pyc,,
xlsxwriter/__pycache__/vml.cpython-39.pyc,,
xlsxwriter/__pycache__/workbook.cpython-39.pyc,,
xlsxwriter/__pycache__/worksheet.cpython-39.pyc,,
xlsxwriter/__pycache__/xmlwriter.cpython-39.pyc,,
xlsxwriter/app.py,sha256=HIn4NhmQ6tVxlKZsEfaxRFJvU5axww15ri-pvAX71so,5739
xlsxwriter/chart.py,sha256=kdXviOVlDeWdRUbOCTyTgIRBN8bdlAuPvBLyEdOB6Ak,126595
xlsxwriter/chart_area.py,sha256=PgcBwTn95KZ1e2xV6hJXhqESfDTDy7npXL0pGO0jPP8,2636
xlsxwriter/chart_bar.py,sha256=E5Sy-CaUexVKOQP1stAUS9TQ23267kMfUztxouzZtBs,4794
xlsxwriter/chart_column.py,sha256=DsZgL7-suMfnaU-5T-rDhJMif_eQk4TkW5wKmtP_p5A,3545
xlsxwriter/chart_doughnut.py,sha256=5P_VFwtZsNxA0F57wO12ruKKeIZ03t2zu1D5IWnzua8,2667
xlsxwriter/chart_line.py,sha256=Vcy75dXpMIh1Wo0oF1o6lnL13SNgLVPTkWgu8VA7nAY,3707
xlsxwriter/chart_pie.py,sha256=VpRU1nCKFg72oSBttvRM98NjopaO4_pMIhDcTV3_xnY,7015
xlsxwriter/chart_radar.py,sha256=uMmLH65izULXMqVr-7E0OASt6FNgwwO5BOdiOz_ZVLM,2699
xlsxwriter/chart_scatter.py,sha256=7i-96AIOCXMf8wf9u47BRbV2RLPBq9mczImEJMFUA6E,9614
xlsxwriter/chart_stock.py,sha256=feNy9MFwMuYrRkKAmT_Cr-1Xq5Fr-DtYz8lMTY6Me48,3550
xlsxwriter/chartsheet.py,sha256=e-06NG8O-DlpFtGdcFfG_rpcXIsZIs1cH6gV2IRz7RI,5562
xlsxwriter/comments.py,sha256=aqb4GsTN69NAz7qxlCUfTn0OI6Dj50ZW1KKKv9hxaqg,5553
xlsxwriter/compatibility.py,sha256=ssiZTXRkBgXd-OK7t8kGhbTV71sbh0CZHhQCXnnaw4s,880
xlsxwriter/contenttypes.py,sha256=a5RUIAsYNRUJajIEqbBJMvCOK1w-VC5kQ1jLaN7ixMU,7095
xlsxwriter/core.py,sha256=u0Yq91Dq5ybLicRV8vMaFc0R5aiibbMeJ0oRPumeedc,5466
xlsxwriter/custom.py,sha256=OujBl7yRZ528uQNGyAhVUWZbsbkkqDgaMZHv16zm54E,3789
xlsxwriter/drawing.py,sha256=GWIwZ99i7nxeEUOdkv0vbvpxybL77VfBIbrEaKYZVc4,33463
xlsxwriter/exceptions.py,sha256=lonqH62Zf2IdPS0xm3Td9cP27uVwpqMvur-p1UHV9Ys,1246
xlsxwriter/format.py,sha256=hyIxEVs10WdMqe2fC0OqRdpsh2QLa-rT6Pm4D6_ZSsg,26229
xlsxwriter/packager.py,sha256=xiYy6UA6offDO5fwZp2hg2h4h7d-8_Cbh0yaFdBt-FQ,23388
xlsxwriter/relationships.py,sha256=CrMRcrNbx1AL2tE64wmJC5ZLhb3MOoR1h2GUhy-_Roc,3422
xlsxwriter/shape.py,sha256=5XYstsJcesJQQYCgBt5Nw768Fed1PXajfR9S2OhK-9g,12855
xlsxwriter/sharedstrings.py,sha256=iLieJCpggXgHn3paaaWW5GvZhss8s4Ds58qfcohkipk,4926
xlsxwriter/styles.py,sha256=WcF3ShEMQheTEooZandD0DGE75DULSvRTmvb9GFd4k8,22998
xlsxwriter/table.py,sha256=D7SzMgKApBWSeQBrlnXa5HfWk1A3z-AjjJjVKQjT5ks,5350
xlsxwriter/theme.py,sha256=rcEb43K8_qExpDhjXvIq-AbYTM_A6nvhSbVHIo32uoU,9011
xlsxwriter/utility.py,sha256=CIrWatwVqbnnQ8_KuhScWkXVtSb2CRa07xbExlLEYSg,26015
xlsxwriter/vml.py,sha256=G8luf69ZdJEUKVYhuVRrbeOR6EuLpFu-qByj7KadeQo,19385
xlsxwriter/workbook.py,sha256=-4bLyfDvflxPQgfKk_LAvpV1pEIGvb_Mvj5w2gTmtTU,61525
xlsxwriter/worksheet.py,sha256=fQR71lKhtBAS_pE1YtNLb6KWl7W0Z5wfwmLMZJ4fksA,255039
xlsxwriter/xmlwriter.py,sha256=6P5dc8-neXMD0fIrzuYyLTJn3awv_Fy2gKTb5L0fLi4,6912


@@ -1,6 +0,0 @@
Wheel-Version: 1.0
Generator: bdist_wheel (0.33.6)
Root-Is-Purelib: true
Tag: py2-none-any
Tag: py3-none-any


@@ -1,123 +0,0 @@
import sys
import os
import re
import importlib
import warnings
is_pypy = '__pypy__' in sys.builtin_module_names
def warn_distutils_present():
if 'distutils' not in sys.modules:
return
if is_pypy and sys.version_info < (3, 7):
# PyPy for 3.6 unconditionally imports distutils, so bypass the warning
# https://foss.heptapod.net/pypy/pypy/-/blob/be829135bc0d758997b3566062999ee8b23872b4/lib-python/3/site.py#L250
return
warnings.warn(
"Distutils was imported before Setuptools, but importing Setuptools "
"also replaces the `distutils` module in `sys.modules`. This may lead "
"to undesirable behaviors or errors. To avoid these issues, avoid "
"using distutils directly, ensure that setuptools is installed in the "
"traditional way (e.g. not an editable install), and/or make sure "
"that setuptools is always imported before distutils.")
def clear_distutils():
if 'distutils' not in sys.modules:
return
warnings.warn("Setuptools is replacing distutils.")
mods = [name for name in sys.modules if re.match(r'distutils\b', name)]
for name in mods:
del sys.modules[name]
def enabled():
"""
Allow selection of distutils by environment variable.
"""
which = os.environ.get('SETUPTOOLS_USE_DISTUTILS', 'stdlib')
return which == 'local'
def ensure_local_distutils():
clear_distutils()
distutils = importlib.import_module('setuptools._distutils')
distutils.__name__ = 'distutils'
sys.modules['distutils'] = distutils
# sanity check that submodules load as expected
core = importlib.import_module('distutils.core')
assert '_distutils' in core.__file__, core.__file__
def do_override():
"""
Ensure that the local copy of distutils is preferred over stdlib.
See https://github.com/pypa/setuptools/issues/417#issuecomment-392298401
for more motivation.
"""
if enabled():
warn_distutils_present()
ensure_local_distutils()
class DistutilsMetaFinder:
def find_spec(self, fullname, path, target=None):
if path is not None:
return
method_name = 'spec_for_{fullname}'.format(**locals())
method = getattr(self, method_name, lambda: None)
return method()
def spec_for_distutils(self):
import importlib.abc
import importlib.util
class DistutilsLoader(importlib.abc.Loader):
def create_module(self, spec):
return importlib.import_module('setuptools._distutils')
def exec_module(self, module):
pass
return importlib.util.spec_from_loader('distutils', DistutilsLoader())
def spec_for_pip(self):
"""
Ensure stdlib distutils when running under pip.
See pypa/pip#8761 for rationale.
"""
if self.pip_imported_during_build():
return
clear_distutils()
self.spec_for_distutils = lambda: None
@staticmethod
def pip_imported_during_build():
"""
Detect if pip is being imported in a build script. Ref #2355.
"""
import traceback
return any(
frame.f_globals['__file__'].endswith('setup.py')
for frame, line in traceback.walk_stack(None)
)
DISTUTILS_FINDER = DistutilsMetaFinder()
def add_shim():
sys.meta_path.insert(0, DISTUTILS_FINDER)
def remove_shim():
try:
sys.meta_path.remove(DISTUTILS_FINDER)
except ValueError:
pass


@@ -1 +0,0 @@
__import__('_distutils_hack').do_override()


@@ -1 +0,0 @@
import os; var = 'SETUPTOOLS_USE_DISTUTILS'; enabled = os.environ.get(var, 'stdlib') == 'local'; enabled and __import__('_distutils_hack').add_shim();


@@ -1,5 +0,0 @@
"""Run the EasyInstall command"""
if __name__ == '__main__':
from setuptools.command.easy_install import main
main()


@@ -1,35 +0,0 @@
Metadata-Version: 1.1
Name: et-xmlfile
Version: 1.0.1
Summary: An implementation of lxml.xmlfile for the standard library
Home-page: https://bitbucket.org/openpyxl/et_xmlfile
Author: See AUTHORS.txt
Author-email: charlie.clark@clark-consulting.eu
License: MIT
Description: et_xmlfile
==========
et_xmlfile is a low memory library for creating large XML files.
It is based upon the `xmlfile module from lxml <http://lxml.de/api.html#incremental-xml-generation>`_ with the aim of allowing code to be developed that will work with both libraries. It was developed initially for the openpyxl project but is now a standalone module.
The code was written by Elias Rabel as part of the `Python Düsseldorf <http://pyddf.de>`_ openpyxl sprint in September 2014.
Note on performance
-------------------
The code was not developed with performance in mind, but it turned out to be faster than the existing SAX-based implementation while remaining significantly slower than lxml's xmlfile. There is one area where an optimisation for lxml will negatively affect the performance of et_xmlfile, and that is when using the `.element()` method on an xmlfile context manager. It is, therefore, recommended not to use this, though the method is provided for code compatibility.
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Requires: python (>=2.6.0)
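A minimal sketch of the incremental API described above, written against the et_xmlfile.xmlfile context manager whose implementation is included later in this commit; the in-memory BytesIO target is only for illustration:

from io import BytesIO
from et_xmlfile import xmlfile

out = BytesIO()
with xmlfile(out) as xf:                      # incremental writer with an lxml-style API
    with xf.element("root"):                  # <root> is written out when its block closes
        with xf.element("child", attrib={"id": "1"}):
            xf.write("some text")             # becomes the element's text content
print(out.getvalue())                         # b'<root><child id="1">some text</child></root>'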


@@ -1,14 +0,0 @@
MANIFEST.in
README.rst
setup.cfg
setup.py
et_xmlfile/__init__.py
et_xmlfile/xmlfile.py
et_xmlfile.egg-info/PKG-INFO
et_xmlfile.egg-info/SOURCES.txt
et_xmlfile.egg-info/dependency_links.txt
et_xmlfile.egg-info/top_level.txt
et_xmlfile/tests/__init__.py
et_xmlfile/tests/common_imports.py
et_xmlfile/tests/helper.py
et_xmlfile/tests/test_incremental_xmlfile.py


@@ -1,16 +0,0 @@
..\et_xmlfile\__init__.py
..\et_xmlfile\__pycache__\__init__.cpython-39.pyc
..\et_xmlfile\__pycache__\xmlfile.cpython-39.pyc
..\et_xmlfile\tests\__init__.py
..\et_xmlfile\tests\__pycache__\__init__.cpython-39.pyc
..\et_xmlfile\tests\__pycache__\common_imports.cpython-39.pyc
..\et_xmlfile\tests\__pycache__\helper.cpython-39.pyc
..\et_xmlfile\tests\__pycache__\test_incremental_xmlfile.cpython-39.pyc
..\et_xmlfile\tests\common_imports.py
..\et_xmlfile\tests\helper.py
..\et_xmlfile\tests\test_incremental_xmlfile.py
..\et_xmlfile\xmlfile.py
PKG-INFO
SOURCES.txt
dependency_links.txt
top_level.txt


@@ -1,12 +0,0 @@
from __future__ import absolute_import
from .xmlfile import xmlfile
# constants
__version__ = '1.0.1'
__author__ = 'See AUTHORS.txt'
__license__ = 'MIT'
__author_email__ = 'charlie.clark@clark-consulting.eu'
__url__ = 'https://bitbucket.org/openpyxl/et_xmlfile'


@@ -1,305 +0,0 @@
import os
import os.path
import re
import gc
import sys
import unittest
try:
import urlparse
except ImportError:
import urllib.parse as urlparse
try:
from urllib import pathname2url
except:
from urllib.request import pathname2url
def make_version_tuple(version_string):
l = []
for part in re.findall('([0-9]+|[^0-9.]+)', version_string):
try:
l.append(int(part))
except ValueError:
l.append(part)
return tuple(l)
IS_PYPY = (getattr(sys, 'implementation', None) == 'pypy' or
getattr(sys, 'pypy_version_info', None) is not None)
IS_PYTHON3 = sys.version_info[0] >= 3
try:
from xml.etree import ElementTree # Python 2.5+
except ImportError:
try:
from elementtree import ElementTree # standard ET
except ImportError:
ElementTree = None
if hasattr(ElementTree, 'VERSION'):
ET_VERSION = make_version_tuple(ElementTree.VERSION)
else:
ET_VERSION = (0,0,0)
try:
from xml.etree import cElementTree # Python 2.5+
except ImportError:
try:
import cElementTree # standard ET
except ImportError:
cElementTree = None
if hasattr(cElementTree, 'VERSION'):
CET_VERSION = make_version_tuple(cElementTree.VERSION)
else:
CET_VERSION = (0,0,0)
def filter_by_version(test_class, version_dict, current_version):
"""Remove test methods that do not work with the current lib version.
"""
find_required_version = version_dict.get
def dummy_test_method(self):
pass
for name in dir(test_class):
expected_version = find_required_version(name, (0,0,0))
if expected_version > current_version:
setattr(test_class, name, dummy_test_method)
try:
import doctest
# check if the system version has everything we need
doctest.DocFileSuite
doctest.DocTestParser
doctest.NORMALIZE_WHITESPACE
doctest.ELLIPSIS
except (ImportError, AttributeError):
# we need our own version to make it work (Python 2.3?)
import local_doctest as doctest
try:
sorted
except NameError:
def sorted(seq, **kwargs):
seq = list(seq)
seq.sort(**kwargs)
return seq
else:
locals()['sorted'] = sorted
try:
next
except NameError:
def next(it):
return it.next()
else:
locals()['next'] = next
try:
import pytest
except ImportError:
class skipif(object):
"Using a class because a function would bind into a method when used in classes"
def __init__(self, *args): pass
def __call__(self, func, *args): return func
else:
skipif = pytest.mark.skipif
def _get_caller_relative_path(filename, frame_depth=2):
module = sys.modules[sys._getframe(frame_depth).f_globals['__name__']]
return os.path.normpath(os.path.join(
os.path.dirname(getattr(module, '__file__', '')), filename))
from io import StringIO
if sys.version_info[0] >= 3:
# Python 3
from builtins import str as unicode
def _str(s, encoding="UTF-8"):
return s
def _bytes(s, encoding="UTF-8"):
return s.encode(encoding)
from io import BytesIO as _BytesIO
def BytesIO(*args):
if args and isinstance(args[0], str):
args = (args[0].encode("UTF-8"),)
return _BytesIO(*args)
doctest_parser = doctest.DocTestParser()
_fix_unicode = re.compile(r'(\s+)u(["\'])').sub
_fix_exceptions = re.compile(r'(.*except [^(]*),\s*(.*:)').sub
def make_doctest(filename):
filename = _get_caller_relative_path(filename)
doctests = read_file(filename)
doctests = _fix_unicode(r'\1\2', doctests)
doctests = _fix_exceptions(r'\1 as \2', doctests)
return doctest.DocTestCase(
doctest_parser.get_doctest(
doctests, {}, os.path.basename(filename), filename, 0))
else:
# Python 2
from __builtin__ import unicode
def _str(s, encoding="UTF-8"):
return unicode(s, encoding=encoding)
def _bytes(s, encoding="UTF-8"):
return s
from io import BytesIO
doctest_parser = doctest.DocTestParser()
_fix_traceback = re.compile(r'^(\s*)(?:\w+\.)+(\w*(?:Error|Exception|Invalid):)', re.M).sub
_fix_exceptions = re.compile(r'(.*except [^(]*)\s+as\s+(.*:)').sub
_fix_bytes = re.compile(r'(\s+)b(["\'])').sub
def make_doctest(filename):
filename = _get_caller_relative_path(filename)
doctests = read_file(filename)
doctests = _fix_traceback(r'\1\2', doctests)
doctests = _fix_exceptions(r'\1, \2', doctests)
doctests = _fix_bytes(r'\1\2', doctests)
return doctest.DocTestCase(
doctest_parser.get_doctest(
doctests, {}, os.path.basename(filename), filename, 0))
try:
skipIf = unittest.skipIf
except AttributeError:
def skipIf(condition, why,
_skip=lambda test_method: None,
_keep=lambda test_method: test_method):
if condition:
return _skip
return _keep
class HelperTestCase(unittest.TestCase):
def tearDown(self):
gc.collect()
def parse(self, text, parser=None):
f = BytesIO(text) if isinstance(text, bytes) else StringIO(text)
return etree.parse(f, parser=parser)
def _rootstring(self, tree):
return etree.tostring(tree.getroot()).replace(
_bytes(' '), _bytes('')).replace(_bytes('\n'), _bytes(''))
# assertFalse doesn't exist in Python 2.3
try:
unittest.TestCase.assertFalse
except AttributeError:
assertFalse = unittest.TestCase.failIf
class SillyFileLike:
def __init__(self, xml_data=_bytes('<foo><bar/></foo>')):
self.xml_data = xml_data
def read(self, amount=None):
if self.xml_data:
if amount:
data = self.xml_data[:amount]
self.xml_data = self.xml_data[amount:]
else:
data = self.xml_data
self.xml_data = _bytes('')
return data
return _bytes('')
class LargeFileLike:
def __init__(self, charlen=100, depth=4, children=5):
self.data = BytesIO()
self.chars = _bytes('a') * charlen
self.children = range(children)
self.more = self.iterelements(depth)
def iterelements(self, depth):
yield _bytes('<root>')
depth -= 1
if depth > 0:
for child in self.children:
for element in self.iterelements(depth):
yield element
yield self.chars
else:
yield self.chars
yield _bytes('</root>')
def read(self, amount=None):
data = self.data
append = data.write
if amount:
for element in self.more:
append(element)
if data.tell() >= amount:
break
else:
for element in self.more:
append(element)
result = data.getvalue()
data.seek(0)
data.truncate()
if amount:
append(result[amount:])
result = result[:amount]
return result
class LargeFileLikeUnicode(LargeFileLike):
def __init__(self, charlen=100, depth=4, children=5):
LargeFileLike.__init__(self, charlen, depth, children)
self.data = StringIO()
self.chars = _str('a') * charlen
self.more = self.iterelements(depth)
def iterelements(self, depth):
yield _str('<root>')
depth -= 1
if depth > 0:
for child in self.children:
for element in self.iterelements(depth):
yield element
yield self.chars
else:
yield self.chars
yield _str('</root>')
def fileInTestDir(name):
_testdir = os.path.dirname(__file__)
return os.path.join(_testdir, name)
def path2url(path):
return urlparse.urljoin(
'file:', pathname2url(path))
def fileUrlInTestDir(name):
return path2url(fileInTestDir(name))
def read_file(name, mode='r'):
f = open(name, mode)
try:
data = f.read()
finally:
f.close()
return data
def write_to_file(name, data, mode='w'):
f = open(name, mode)
try:
data = f.write(data)
finally:
f.close()
def readFileInTestDir(name, mode='r'):
return read_file(fileInTestDir(name), mode)
def canonicalize(xml):
tree = etree.parse(BytesIO(xml) if isinstance(xml, bytes) else StringIO(xml))
f = BytesIO()
tree.write_c14n(f)
return f.getvalue()
def unentitify(xml):
for entity_name, value in re.findall("(&#([0-9]+);)", xml):
xml = xml.replace(entity_name, unichr(int(value)))
return xml


@@ -1,21 +0,0 @@
from __future__ import absolute_import
# Copyright (c) 2010-2015 openpyxl
# Python stdlib imports
from lxml.doctestcompare import LXMLOutputChecker, PARSE_XML
def compare_xml(generated, expected):
"""Use doctest checking from lxml for comparing XML trees. Returns diff if the two are not the same"""
checker = LXMLOutputChecker()
class DummyDocTest():
pass
ob = DummyDocTest()
ob.want = expected
check = checker.check_output(expected, generated, PARSE_XML)
if check is False:
diff = checker.output_difference(ob, generated, PARSE_XML)
return diff


@@ -1,356 +0,0 @@
from __future__ import absolute_import
"""
Tests for the incremental XML serialisation API.
Adapted from the tests from lxml.etree.xmlfile
"""
try:
import lxml
except ImportError:
raise ImportError("lxml is required to run the tests.")
from io import BytesIO
import unittest
import tempfile, os, sys
from .common_imports import HelperTestCase, skipIf
from et_xmlfile import xmlfile
from et_xmlfile.xmlfile import LxmlSyntaxError
import pytest
from .helper import compare_xml
import xml.etree.ElementTree
from xml.etree.ElementTree import Element, parse
class _XmlFileTestCaseBase(HelperTestCase):
_file = None # to be set by specific subtypes below
def setUp(self):
self._file = BytesIO()
def test_element(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
pass
self.assertXml('<test></test>')
def test_element_write_text(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
xf.write('toast')
self.assertXml('<test>toast</test>')
def test_element_nested(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
with xf.element('toast'):
with xf.element('taste'):
xf.write('conTent')
self.assertXml('<test><toast><taste>conTent</taste></toast></test>')
def test_element_nested_with_text(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
xf.write('con')
with xf.element('toast'):
xf.write('tent')
with xf.element('taste'):
xf.write('inside')
xf.write('tnet')
xf.write('noc')
self.assertXml('<test>con<toast>tent<taste>inside</taste>'
'tnet</toast>noc</test>')
def test_write_Element(self):
with xmlfile(self._file) as xf:
xf.write(Element('test'))
self.assertXml('<test/>')
def test_write_Element_repeatedly(self):
element = Element('test')
with xmlfile(self._file) as xf:
with xf.element('test'):
for i in range(100):
xf.write(element)
tree = self._parse_file()
self.assertTrue(tree is not None)
self.assertEqual(100, len(tree.getroot()))
self.assertEqual(set(['test']), set(el.tag for el in tree.getroot()))
def test_namespace_nsmap(self):
with xmlfile(self._file) as xf:
with xf.element('{nsURI}test', nsmap={'x': 'nsURI'}):
pass
self.assertXml('<x:test xmlns:x="nsURI"></x:test>')
def test_namespace_nested_nsmap(self):
with xmlfile(self._file) as xf:
with xf.element('test', nsmap={'x': 'nsURI'}):
with xf.element('{nsURI}toast'):
pass
self.assertXml('<test xmlns:x="nsURI"><x:toast></x:toast></test>')
def test_anonymous_namespace(self):
with xmlfile(self._file) as xf:
with xf.element('{nsURI}test'):
pass
self.assertXml('<ns0:test xmlns:ns0="nsURI"></ns0:test>')
def test_namespace_nested_anonymous(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
with xf.element('{nsURI}toast'):
pass
self.assertXml('<test><ns0:toast xmlns:ns0="nsURI"></ns0:toast></test>')
def test_default_namespace(self):
with xmlfile(self._file) as xf:
with xf.element('{nsURI}test', nsmap={None: 'nsURI'}):
pass
self.assertXml('<test xmlns="nsURI"></test>')
def test_nested_default_namespace(self):
with xmlfile(self._file) as xf:
with xf.element('{nsURI}test', nsmap={None: 'nsURI'}):
with xf.element('{nsURI}toast'):
pass
self.assertXml('<test xmlns="nsURI"><toast></toast></test>')
@pytest.mark.xfail
def test_pi(self):
from et_xmlfile.xmlfile import ProcessingInstruction
with xmlfile(self._file) as xf:
xf.write(ProcessingInstruction('pypi'))
with xf.element('test'):
pass
self.assertXml('<?pypi ?><test></test>')
@pytest.mark.xfail
def test_comment(self):
with xmlfile(self._file) as xf:
xf.write(etree.Comment('a comment'))
with xf.element('test'):
pass
self.assertXml('<!--a comment--><test></test>')
def test_attribute(self):
with xmlfile(self._file) as xf:
with xf.element('test', attrib={'k': 'v'}):
pass
self.assertXml('<test k="v"></test>')
def test_escaping(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
xf.write('Comments: <!-- text -->\n')
xf.write('Entities: &amp;')
self.assertXml(
'<test>Comments: &lt;!-- text --&gt;\nEntities: &amp;amp;</test>')
@pytest.mark.xfail
def test_encoding(self):
with xmlfile(self._file, encoding='utf16') as xf:
with xf.element('test'):
xf.write('toast')
self.assertXml('<test>toast</test>', encoding='utf16')
@pytest.mark.xfail
def test_buffering(self):
with xmlfile(self._file, buffered=False) as xf:
with xf.element('test'):
self.assertXml("<test>")
xf.write('toast')
self.assertXml("<test>toast")
with xf.element('taste'):
self.assertXml("<test>toast<taste>")
xf.write('some', etree.Element("more"), "toast")
self.assertXml("<test>toast<taste>some<more/>toast")
self.assertXml("<test>toast<taste>some<more/>toast</taste>")
xf.write('end')
self.assertXml("<test>toast<taste>some<more/>toast</taste>end")
self.assertXml("<test>toast<taste>some<more/>toast</taste>end</test>")
self.assertXml("<test>toast<taste>some<more/>toast</taste>end</test>")
@pytest.mark.xfail
def test_flush(self):
with xmlfile(self._file, buffered=True) as xf:
with xf.element('test'):
self.assertXml("")
xf.write('toast')
self.assertXml("")
with xf.element('taste'):
self.assertXml("")
xf.flush()
self.assertXml("<test>toast<taste>")
self.assertXml("<test>toast<taste>")
self.assertXml("<test>toast<taste>")
self.assertXml("<test>toast<taste></taste></test>")
def test_failure_preceding_text(self):
try:
with xmlfile(self._file) as xf:
xf.write('toast')
except LxmlSyntaxError:
self.assertTrue(True)
else:
self.assertTrue(False)
def test_failure_trailing_text(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
pass
try:
xf.write('toast')
except LxmlSyntaxError:
self.assertTrue(True)
else:
self.assertTrue(False)
def test_failure_trailing_Element(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
pass
try:
xf.write(Element('test'))
except LxmlSyntaxError:
self.assertTrue(True)
else:
self.assertTrue(False)
@pytest.mark.xfail
def test_closing_out_of_order_in_error_case(self):
cm_exit = None
try:
with xmlfile(self._file) as xf:
x = xf.element('test')
cm_exit = x.__exit__
x.__enter__()
raise ValueError('123')
except ValueError:
self.assertTrue(cm_exit)
try:
cm_exit(ValueError, ValueError("huhu"), None)
except LxmlSyntaxError:
self.assertTrue(True)
else:
self.assertTrue(False)
else:
self.assertTrue(False)
def _read_file(self):
pos = self._file.tell()
self._file.seek(0)
try:
return self._file.read()
finally:
self._file.seek(pos)
def _parse_file(self):
pos = self._file.tell()
self._file.seek(0)
try:
return parse(self._file)
finally:
self._file.seek(pos)
def tearDown(self):
if self._file is not None:
self._file.close()
def assertXml(self, expected, encoding='utf8'):
diff = compare_xml(self._read_file().decode(encoding), expected)
assert diff is None, diff
class BytesIOXmlFileTestCase(_XmlFileTestCaseBase):
def setUp(self):
self._file = BytesIO()
def test_filelike_close(self):
with xmlfile(self._file, close=True) as xf:
with xf.element('test'):
pass
self.assertRaises(ValueError, self._file.getvalue)
class TempXmlFileTestCase(_XmlFileTestCaseBase):
def setUp(self):
self._file = tempfile.TemporaryFile()
class TempPathXmlFileTestCase(_XmlFileTestCaseBase):
def setUp(self):
self._tmpfile = tempfile.NamedTemporaryFile(delete=False)
self._file = self._tmpfile.name
def tearDown(self):
try:
self._tmpfile.close()
finally:
if os.path.exists(self._tmpfile.name):
os.unlink(self._tmpfile.name)
def _read_file(self):
self._tmpfile.seek(0)
return self._tmpfile.read()
def _parse_file(self):
self._tmpfile.seek(0)
return parse(self._tmpfile)
@skipIf(True, "temp file behaviour is too platform specific here")
def test_buffering(self):
pass
@skipIf(True, "temp file behaviour is too platform specific here")
def test_flush(self):
pass
class SimpleFileLikeXmlFileTestCase(_XmlFileTestCaseBase):
class SimpleFileLike(object):
def __init__(self, target):
self._target = target
self.write = target.write
self.tell = target.tell
self.seek = target.seek
self.closed = False
def close(self):
assert not self.closed
self.closed = True
self._target.close()
def setUp(self):
self._target = BytesIO()
self._file = self.SimpleFileLike(self._target)
def _read_file(self):
return self._target.getvalue()
def _parse_file(self):
pos = self._file.tell()
self._target.seek(0)
try:
return parse(self._target)
finally:
self._target.seek(pos)
def test_filelike_not_closing(self):
with xmlfile(self._file) as xf:
with xf.element('test'):
pass
self.assertFalse(self._file.closed)
def test_filelike_close(self):
with xmlfile(self._file, close=True) as xf:
with xf.element('test'):
pass
self.assertTrue(self._file.closed)
self._file = None # prevent closing in tearDown()


@@ -1,104 +0,0 @@
from __future__ import absolute_import
# Copyright (c) 2010-2015 openpyxl
"""Implements the lxml.etree.xmlfile API using the standard library xml.etree"""
from contextlib import contextmanager
from xml.etree.ElementTree import Element, tostring
class LxmlSyntaxError(Exception):
pass
class _FakeIncrementalFileWriter(object):
"""Replacement for _IncrementalFileWriter of lxml.
Uses ElementTree to build xml in memory."""
def __init__(self, output_file):
self._element_stack = []
self._top_element = None
self._file = output_file
self._have_root = False
@contextmanager
def element(self, tag, attrib=None, nsmap=None, **_extra):
"""Create a new xml element using a context manager.
The elements are written when the top level context is left.
This is for code compatibility only as it is quite slow.
"""
# __enter__ part
self._have_root = True
if attrib is None:
attrib = {}
self._top_element = Element(tag, attrib=attrib, **_extra)
self._top_element.text = ''
self._top_element.tail = ''
self._element_stack.append(self._top_element)
yield
# __exit__ part
el = self._element_stack.pop()
if self._element_stack:
parent = self._element_stack[-1]
parent.append(self._top_element)
self._top_element = parent
else:
self._write_element(el)
self._top_element = None
def write(self, arg):
"""Write a string or subelement."""
if isinstance(arg, str):
# it is not allowed to write a string outside of an element
if self._top_element is None:
raise LxmlSyntaxError()
if len(self._top_element) == 0:
# element has no children: add string to text
self._top_element.text += arg
else:
# element has children: add string to tail of last child
self._top_element[-1].tail += arg
else:
if self._top_element is not None:
self._top_element.append(arg)
elif not self._have_root:
self._write_element(arg)
else:
raise LxmlSyntaxError()
def _write_element(self, element):
xml = tostring(element)
self._file.write(xml)
def __enter__(self):
pass
def __exit__(self, type, value, traceback):
# without root the xml document is incomplete
if not self._have_root:
raise LxmlSyntaxError()
class xmlfile(object):
"""Context manager that can replace lxml.etree.xmlfile."""
def __init__(self, output_file, buffered=False, encoding=None, close=False):
if isinstance(output_file, str):
self._file = open(output_file, 'wb')
self._close = True
else:
self._file = output_file
self._close = close
def __enter__(self):
return _FakeIncrementalFileWriter(self._file)
def __exit__(self, type, value, traceback):
if self._close == True:
self._file.close()
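A rough usage note for the writer above: strings can only be written inside an open element, while a whole Element may be written directly as the document root; anything else raises LxmlSyntaxError. A small sketch under those assumptions, consistent with the tests for this module included earlier in the commit:

from io import BytesIO
from xml.etree.ElementTree import Element
from et_xmlfile import xmlfile
from et_xmlfile.xmlfile import LxmlSyntaxError

buf = BytesIO()
with xmlfile(buf) as xf:
    xf.write(Element("empty"))                # a prebuilt Element is accepted as the root
print(buf.getvalue())                         # b'<empty />'

buf = BytesIO()
try:
    with xmlfile(buf) as xf:
        xf.write("loose text")                # text outside of any element is rejected
except LxmlSyntaxError:
    print("cannot write text before the root element")

This mirrors lxml's xmlfile behaviour closely enough for openpyxl's purposes while building each element in memory with ElementTree.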


@@ -1,86 +0,0 @@
# __init__.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
# flake8: noqa
#@PydevCodeAnalysisIgnore
import inspect
import os
import sys
import os.path as osp
__version__ = '3.1.12'
#{ Initialization
def _init_externals():
"""Initialize external projects by putting them into the path"""
if __version__ == '3.1.12' and 'PYOXIDIZER' not in os.environ:
sys.path.insert(1, osp.join(osp.dirname(__file__), 'ext', 'gitdb'))
try:
import gitdb
except ImportError as e:
raise ImportError("'gitdb' could not be found in your PYTHONPATH") from e
# END verify import
#} END initialization
#################
_init_externals()
#################
#{ Imports
from git.exc import * # @NoMove @IgnorePep8
try:
from git.config import GitConfigParser # @NoMove @IgnorePep8
from git.objects import * # @NoMove @IgnorePep8
from git.refs import * # @NoMove @IgnorePep8
from git.diff import * # @NoMove @IgnorePep8
from git.db import * # @NoMove @IgnorePep8
from git.cmd import Git # @NoMove @IgnorePep8
from git.repo import Repo # @NoMove @IgnorePep8
from git.remote import * # @NoMove @IgnorePep8
from git.index import * # @NoMove @IgnorePep8
from git.util import ( # @NoMove @IgnorePep8
LockFile,
BlockingLockFile,
Stats,
Actor,
rmtree,
)
except GitError as exc:
raise ImportError('%s: %s' % (exc.__class__.__name__, exc)) from exc
#} END imports
__all__ = [name for name, obj in locals().items()
if not (name.startswith('_') or inspect.ismodule(obj))]
#{ Initialize git executable path
GIT_OK = None
def refresh(path=None):
"""Convenience method for setting the git executable path."""
global GIT_OK
GIT_OK = False
if not Git.refresh(path=path):
return
if not FetchInfo.refresh():
return
GIT_OK = True
#} END initialize git executable path
#################
try:
refresh()
except Exception as exc:
raise ImportError('Failed to initialize: {0}'.format(exc)) from exc
#################


@@ -1,67 +0,0 @@
# -*- coding: utf-8 -*-
# compat.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
"""utilities to help provide compatibility with python 3"""
# flake8: noqa
import locale
import os
import sys
from gitdb.utils.encoding import (
force_bytes, # @UnusedImport
force_text # @UnusedImport
)
is_win = (os.name == 'nt')
is_posix = (os.name == 'posix')
is_darwin = (os.name == 'darwin')
defenc = sys.getfilesystemencoding()
def safe_decode(s):
"""Safely decodes a binary string to unicode"""
if isinstance(s, str):
return s
elif isinstance(s, bytes):
return s.decode(defenc, 'surrogateescape')
elif s is not None:
raise TypeError('Expected bytes or text, but got %r' % (s,))
def safe_encode(s):
"""Safely decodes a binary string to unicode"""
if isinstance(s, str):
return s.encode(defenc)
elif isinstance(s, bytes):
return s
elif s is not None:
raise TypeError('Expected bytes or text, but got %r' % (s,))
def win_encode(s):
"""Encode unicodes for process arguments on Windows."""
if isinstance(s, str):
return s.encode(locale.getpreferredencoding(False))
elif isinstance(s, bytes):
return s
elif s is not None:
raise TypeError('Expected bytes or text, but got %r' % (s,))
def with_metaclass(meta, *bases):
"""copied from https://github.com/Byron/bcore/blob/master/src/python/butility/future.py#L15"""
class metaclass(meta):
__call__ = type.__call__
__init__ = type.__init__
def __new__(cls, name, nbases, d):
if nbases is None:
return type.__new__(cls, name, (), d)
return meta(name, bases, d)
return metaclass(meta.__name__ + 'Helper', None, {})


@@ -1,794 +0,0 @@
# config.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module containing module parser implementation able to properly read and write
configuration files"""
import abc
from functools import wraps
import inspect
from io import IOBase
import logging
import os
import re
import fnmatch
from collections import OrderedDict
from git.compat import (
defenc,
force_text,
with_metaclass,
is_win,
)
from git.util import LockFile
import os.path as osp
import configparser as cp
__all__ = ('GitConfigParser', 'SectionConstraint')
log = logging.getLogger('git.config')
log.addHandler(logging.NullHandler())
# invariants
# represents the configuration level of a configuration file
CONFIG_LEVELS = ("system", "user", "global", "repository")
# Section pattern to detect conditional includes.
# https://git-scm.com/docs/git-config#_conditional_includes
CONDITIONAL_INCLUDE_REGEXP = re.compile(r"(?<=includeIf )\"(gitdir|gitdir/i|onbranch):(.+)\"")
class MetaParserBuilder(abc.ABCMeta):
"""Utlity class wrapping base-class methods into decorators that assure read-only properties"""
def __new__(cls, name, bases, clsdict):
"""
Equip all base-class methods with a needs_values decorator, and all non-const methods
with a set_dirty_and_flush_changes decorator in addition to that."""
kmm = '_mutating_methods_'
if kmm in clsdict:
mutating_methods = clsdict[kmm]
for base in bases:
methods = (t for t in inspect.getmembers(base, inspect.isroutine) if not t[0].startswith("_"))
for name, method in methods:
if name in clsdict:
continue
method_with_values = needs_values(method)
if name in mutating_methods:
method_with_values = set_dirty_and_flush_changes(method_with_values)
# END mutating methods handling
clsdict[name] = method_with_values
# END for each name/method pair
# END for each base
# END if mutating methods configuration is set
new_type = super(MetaParserBuilder, cls).__new__(cls, name, bases, clsdict)
return new_type
def needs_values(func):
"""Returns method assuring we read values (on demand) before we try to access them"""
@wraps(func)
def assure_data_present(self, *args, **kwargs):
self.read()
return func(self, *args, **kwargs)
# END wrapper method
return assure_data_present
def set_dirty_and_flush_changes(non_const_func):
"""Return method that checks whether given non constant function may be called.
If so, the instance will be set dirty.
Additionally, we flush the changes right to disk"""
def flush_changes(self, *args, **kwargs):
rval = non_const_func(self, *args, **kwargs)
self._dirty = True
self.write()
return rval
# END wrapper method
flush_changes.__name__ = non_const_func.__name__
return flush_changes
class SectionConstraint(object):
"""Constrains a ConfigParser to only option commands which are constrained to
always use the section we have been initialized with.
It supports all ConfigParser methods that operate on an option.
:note:
If used as a context manager, will release the wrapped ConfigParser."""
__slots__ = ("_config", "_section_name")
_valid_attrs_ = ("get_value", "set_value", "get", "set", "getint", "getfloat", "getboolean", "has_option",
"remove_section", "remove_option", "options")
def __init__(self, config, section):
self._config = config
self._section_name = section
def __del__(self):
# Yes, for some reason, we have to call it explicitly for it to work in PY3!
# Apparently __del__ doesn't get called anymore if refcount becomes 0
# Ridiculous ... .
self._config.release()
def __getattr__(self, attr):
if attr in self._valid_attrs_:
return lambda *args, **kwargs: self._call_config(attr, *args, **kwargs)
return super(SectionConstraint, self).__getattribute__(attr)
def _call_config(self, method, *args, **kwargs):
"""Call the configuration at the given method which must take a section name
as first argument"""
return getattr(self._config, method)(self._section_name, *args, **kwargs)
@property
def config(self):
"""return: Configparser instance we constrain"""
return self._config
def release(self):
"""Equivalent to GitConfigParser.release(), which is called on our underlying parser instance"""
return self._config.release()
def __enter__(self):
self._config.__enter__()
return self
def __exit__(self, exception_type, exception_value, traceback):
self._config.__exit__(exception_type, exception_value, traceback)
class _OMD(OrderedDict):
"""Ordered multi-dict."""
def __setitem__(self, key, value):
super(_OMD, self).__setitem__(key, [value])
def add(self, key, value):
if key not in self:
super(_OMD, self).__setitem__(key, [value])
return
super(_OMD, self).__getitem__(key).append(value)
def setall(self, key, values):
super(_OMD, self).__setitem__(key, values)
def __getitem__(self, key):
return super(_OMD, self).__getitem__(key)[-1]
def getlast(self, key):
return super(_OMD, self).__getitem__(key)[-1]
def setlast(self, key, value):
if key not in self:
super(_OMD, self).__setitem__(key, [value])
return
prior = super(_OMD, self).__getitem__(key)
prior[-1] = value
def get(self, key, default=None):
return super(_OMD, self).get(key, [default])[-1]
def getall(self, key):
return super(_OMD, self).__getitem__(key)
def items(self):
"""List of (key, last value for key)."""
return [(k, self[k]) for k in self]
def items_all(self):
"""List of (key, list of values for key)."""
return [(k, self.getall(k)) for k in self]
def get_config_path(config_level):
# we do not support an absolute path of the gitconfig on windows ,
# use the global config instead
if is_win and config_level == "system":
config_level = "global"
if config_level == "system":
return "/etc/gitconfig"
elif config_level == "user":
config_home = os.environ.get("XDG_CONFIG_HOME") or osp.join(os.environ.get("HOME", '~'), ".config")
return osp.normpath(osp.expanduser(osp.join(config_home, "git", "config")))
elif config_level == "global":
return osp.normpath(osp.expanduser("~/.gitconfig"))
elif config_level == "repository":
raise ValueError("No repo to get repository configuration from. Use Repo._get_config_path")
raise ValueError("Invalid configuration level: %r" % config_level)
class GitConfigParser(with_metaclass(MetaParserBuilder, cp.RawConfigParser, object)):
"""Implements specifics required to read git style configuration files.
This variation behaves much like the git.config command such that the configuration
will be read on demand based on the filepath given during initialization.
The changes will automatically be written once the instance goes out of scope, but
can be triggered manually as well.
The configuration file will be locked if you intend to change values, preventing other
instances from writing concurrently.
:note:
The config is case-sensitive even when queried, hence section and option names
must match perfectly.
If used as a context manager, will release the locked file."""
#{ Configuration
# The lock type determines the type of lock to use in new configuration readers.
# They must be compatible to the LockFile interface.
# A suitable alternative would be the BlockingLockFile
t_lock = LockFile
re_comment = re.compile(r'^\s*[#;]')
#} END configuration
optvalueonly_source = r'\s*(?P<option>[^:=\s][^:=]*)'
OPTVALUEONLY = re.compile(optvalueonly_source)
OPTCRE = re.compile(optvalueonly_source + r'\s*(?P<vi>[:=])\s*' + r'(?P<value>.*)$')
del optvalueonly_source
# list of RawConfigParser methods able to change the instance
_mutating_methods_ = ("add_section", "remove_section", "remove_option", "set")
def __init__(self, file_or_files=None, read_only=True, merge_includes=True, config_level=None, repo=None):
"""Initialize a configuration reader to read the given file_or_files and to
possibly allow changes to it by setting read_only False
:param file_or_files:
A single file path or file objects or multiple of these
:param read_only:
If True, the ConfigParser may only read the data , but not change it.
If False, only a single file path or file object may be given. We will write back the changes
when they happen, or when the ConfigParser is released. This will not happen if other
configuration files have been included
:param merge_includes: if True, we will read files mentioned in [include] sections and merge their
contents into ours. This makes it impossible to write back an individual configuration file.
Thus, if you want to modify a single configuration file, turn this off to leave the original
dataset unaltered when reading it.
:param repo: Reference to repository to use if [includeIf] sections are found in configuration files.
"""
cp.RawConfigParser.__init__(self, dict_type=_OMD)
# Used in python 3, needs to stay in sync with sections for underlying implementation to work
if not hasattr(self, '_proxies'):
self._proxies = self._dict()
if file_or_files is not None:
self._file_or_files = file_or_files
else:
if config_level is None:
if read_only:
self._file_or_files = [get_config_path(f) for f in CONFIG_LEVELS if f != 'repository']
else:
raise ValueError("No configuration level or configuration files specified")
else:
self._file_or_files = [get_config_path(config_level)]
self._read_only = read_only
self._dirty = False
self._is_initialized = False
self._merge_includes = merge_includes
self._repo = repo
self._lock = None
self._acquire_lock()
def _acquire_lock(self):
if not self._read_only:
if not self._lock:
if isinstance(self._file_or_files, (tuple, list)):
raise ValueError(
"Write-ConfigParsers can operate on a single file only, multiple files have been passed")
# END single file check
file_or_files = self._file_or_files
if not isinstance(self._file_or_files, str):
file_or_files = self._file_or_files.name
# END get filename from handle/stream
# initialize lock base - we want to write
self._lock = self.t_lock(file_or_files)
# END lock check
self._lock._obtain_lock()
# END read-only check
def __del__(self):
"""Write pending changes if required and release locks"""
# NOTE: only consistent in PY2
self.release()
def __enter__(self):
self._acquire_lock()
return self
def __exit__(self, exception_type, exception_value, traceback):
self.release()
def release(self):
"""Flush changes and release the configuration write lock. This instance must not be used anymore afterwards.
In Python 3, it's required to explicitly release locks and flush changes, as __del__ is not called
deterministically anymore."""
# checking for the lock here makes sure we do not raise during write()
# in case an invalid parser was created who could not get a lock
if self.read_only or (self._lock and not self._lock._has_lock()):
return
try:
try:
self.write()
except IOError:
log.error("Exception during destruction of GitConfigParser", exc_info=True)
except ReferenceError:
# This happens in PY3 ... and usually means that some state cannot be written
# as the sections dict cannot be iterated
# Usually happens when shutting down the interpreter; we don't know how to fix this
pass
finally:
self._lock._release_lock()
def optionxform(self, optionstr):
"""Do not transform options in any way when writing"""
return optionstr
def _read(self, fp, fpname):
"""A direct copy of the py2.4 version of the super class's _read method
to assure it uses ordered dicts. Had to change one line to make it work.
Future versions have this fixed, but in fact its quite embarrassing for the
guys not to have done it right in the first place !
Removed big comments to make it more compact.
Made sure it ignores initial whitespace as git uses tabs"""
cursect = None # None, or a dictionary
optname = None
lineno = 0
is_multi_line = False
e = None # None, or an exception
def string_decode(v):
if v[-1] == '\\':
v = v[:-1]
# end cut trailing escapes to prevent decode error
return v.encode(defenc).decode('unicode_escape')
# end
# end
while True:
# we assume to read binary !
line = fp.readline().decode(defenc)
if not line:
break
lineno = lineno + 1
# comment or blank line?
if line.strip() == '' or self.re_comment.match(line):
continue
if line.split(None, 1)[0].lower() == 'rem' and line[0] in "rR":
# no leading whitespace
continue
# is it a section header?
mo = self.SECTCRE.match(line.strip())
if not is_multi_line and mo:
sectname = mo.group('header').strip()
if sectname in self._sections:
cursect = self._sections[sectname]
elif sectname == cp.DEFAULTSECT:
cursect = self._defaults
else:
cursect = self._dict((('__name__', sectname),))
self._sections[sectname] = cursect
self._proxies[sectname] = None
# So sections can't start with a continuation line
optname = None
# no section header in the file?
elif cursect is None:
raise cp.MissingSectionHeaderError(fpname, lineno, line)
# an option line?
elif not is_multi_line:
mo = self.OPTCRE.match(line)
if mo:
# We might just have handled the last line, which could contain a quotation we want to remove
optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval and not optval.strip().startswith('"'):
pos = optval.find(';')
if pos != -1 and optval[pos - 1].isspace():
optval = optval[:pos]
optval = optval.strip()
if optval == '""':
optval = ''
# end handle empty string
optname = self.optionxform(optname.rstrip())
if len(optval) > 1 and optval[0] == '"' and optval[-1] != '"':
is_multi_line = True
optval = string_decode(optval[1:])
# end handle multi-line
# preserves multiple values for duplicate optnames
cursect.add(optname, optval)
else:
# check if it's an option with no value - it's just ignored by git
if not self.OPTVALUEONLY.match(line):
if not e:
e = cp.ParsingError(fpname)
e.append(lineno, repr(line))
continue
else:
line = line.rstrip()
if line.endswith('"'):
is_multi_line = False
line = line[:-1]
# end handle quotations
optval = cursect.getlast(optname)
cursect.setlast(optname, optval + string_decode(line))
# END parse section or option
# END while reading
# if any parsing errors occurred, raise an exception
if e:
raise e
def _has_includes(self):
return self._merge_includes and len(self._included_paths())
def _included_paths(self):
"""Return all paths that must be included to configuration.
"""
paths = []
for section in self.sections():
if section == "include":
paths += self.items(section)
match = CONDITIONAL_INCLUDE_REGEXP.search(section)
if match is None or self._repo is None:
continue
keyword = match.group(1)
value = match.group(2).strip()
if keyword in ["gitdir", "gitdir/i"]:
value = osp.expanduser(value)
if not any(value.startswith(s) for s in ["./", "/"]):
value = "**/" + value
if value.endswith("/"):
value += "**"
# Ensure that glob is always case insensitive if required.
if keyword.endswith("/i"):
value = re.sub(
r"[a-zA-Z]",
lambda m: "[{}{}]".format(
m.group().lower(),
m.group().upper()
),
value
)
if fnmatch.fnmatchcase(self._repo.git_dir, value):
paths += self.items(section)
elif keyword == "onbranch":
try:
branch_name = self._repo.active_branch.name
except TypeError:
# Ignore section if active branch cannot be retrieved.
continue
if fnmatch.fnmatchcase(branch_name, value):
paths += self.items(section)
return paths
def read(self):
"""Reads the data stored in the files we have been initialized with. It will
ignore files that cannot be read, possibly leaving an empty configuration
:return: Nothing
:raise IOError: if a file cannot be handled"""
if self._is_initialized:
return
self._is_initialized = True
if not isinstance(self._file_or_files, (tuple, list)):
files_to_read = [self._file_or_files]
else:
files_to_read = list(self._file_or_files)
# end assure we have a copy of the paths to handle
seen = set(files_to_read)
num_read_include_files = 0
while files_to_read:
file_path = files_to_read.pop(0)
fp = file_path
file_ok = False
if hasattr(fp, "seek"):
self._read(fp, fp.name)
else:
# assume a path if it is not a file-object
try:
with open(file_path, 'rb') as fp:
file_ok = True
self._read(fp, fp.name)
except IOError:
continue
# Read includes and append those that we didn't handle yet
# We expect all paths to be normalized and absolute (and will assure that is the case)
if self._has_includes():
for _, include_path in self._included_paths():
if include_path.startswith('~'):
include_path = osp.expanduser(include_path)
if not osp.isabs(include_path):
if not file_ok:
continue
# end ignore relative paths if we don't know the configuration file path
assert osp.isabs(file_path), "Need absolute paths to be sure our cycle checks will work"
include_path = osp.join(osp.dirname(file_path), include_path)
# end make include path absolute
include_path = osp.normpath(include_path)
if include_path in seen or not os.access(include_path, os.R_OK):
continue
seen.add(include_path)
# insert included file to the top to be considered first
files_to_read.insert(0, include_path)
num_read_include_files += 1
# each include path in configuration file
# end handle includes
# END for each file object to read
# If no files were included, we can safely write back the configuration file
# without altering its meaning
if num_read_include_files == 0:
self._merge_includes = False
# end
def _write(self, fp):
"""Write an .ini-format representation of the configuration state in
git compatible format"""
def write_section(name, section_dict):
fp.write(("[%s]\n" % name).encode(defenc))
for (key, values) in section_dict.items_all():
if key == "__name__":
continue
for v in values:
fp.write(("\t%s = %s\n" % (key, self._value_to_string(v).replace('\n', '\n\t'))).encode(defenc))
# END if key is not __name__
# END section writing
if self._defaults:
write_section(cp.DEFAULTSECT, self._defaults)
for name, value in self._sections.items():
write_section(name, value)
def items(self, section_name):
""":return: list((option, value), ...) pairs of all items in the given section"""
return [(k, v) for k, v in super(GitConfigParser, self).items(section_name) if k != '__name__']
def items_all(self, section_name):
""":return: list((option, [values...]), ...) pairs of all items in the given section"""
rv = _OMD(self._defaults)
for k, vs in self._sections[section_name].items_all():
if k == '__name__':
continue
if k in rv and rv.getall(k) == vs:
continue
for v in vs:
rv.add(k, v)
return rv.items_all()
@needs_values
def write(self):
"""Write changes to our file, if there are changes at all
:raise IOError: if this is a read-only writer instance or if we could not obtain
a file lock"""
self._assure_writable("write")
if not self._dirty:
return
if isinstance(self._file_or_files, (list, tuple)):
raise AssertionError("Cannot write back if there is not exactly a single file to write to, have %i files"
% len(self._file_or_files))
# end assert multiple files
if self._has_includes():
log.debug("Skipping write-back of configuration file as include files were merged in." +
"Set merge_includes=False to prevent this.")
return
# end
fp = self._file_or_files
# we have a physical file on disk, so get a lock
is_file_lock = isinstance(fp, (str, IOBase))
if is_file_lock:
self._lock._obtain_lock()
if not hasattr(fp, "seek"):
with open(self._file_or_files, "wb") as fp:
self._write(fp)
else:
fp.seek(0)
# make sure we do not overwrite into an existing file
if hasattr(fp, 'truncate'):
fp.truncate()
self._write(fp)
def _assure_writable(self, method_name):
if self.read_only:
raise IOError("Cannot execute non-constant method %s.%s" % (self, method_name))
def add_section(self, section):
"""Assures added options will stay in order"""
return super(GitConfigParser, self).add_section(section)
@property
def read_only(self):
""":return: True if this instance may change the configuration file"""
return self._read_only
def get_value(self, section, option, default=None):
"""Get an option's value.
If multiple values are specified for this option in the section, the
last one specified is returned.
:param default:
If not None, the given default value will be returned in case
the option did not exist
:return: a properly typed value, either int, float or string
:raise TypeError: in case the value could not be understood
Otherwise the exceptions known to the ConfigParser will be raised."""
try:
valuestr = self.get(section, option)
except Exception:
if default is not None:
return default
raise
return self._string_to_value(valuestr)
def get_values(self, section, option, default=None):
"""Get an option's values.
If multiple values are specified for this option in the section, all are
returned.
:param default:
If not None, a list containing the given default value will be
returned in case the option did not exist
:return: a list of properly typed values, either int, float or string
:raise TypeError: in case the value could not be understood
Otherwise the exceptions known to the ConfigParser will be raised."""
try:
lst = self._sections[section].getall(option)
except Exception:
if default is not None:
return [default]
raise
return [self._string_to_value(valuestr) for valuestr in lst]
def _string_to_value(self, valuestr):
types = (int, float)
for numtype in types:
try:
val = numtype(valuestr)
# truncated value ?
if val != float(valuestr):
continue
return val
except (ValueError, TypeError):
continue
# END for each numeric type
# try boolean values as git uses them
vl = valuestr.lower()
if vl == 'false':
return False
if vl == 'true':
return True
if not isinstance(valuestr, str):
raise TypeError(
"Invalid value type: only int, long, float and str are allowed",
valuestr)
return valuestr
def _value_to_string(self, value):
if isinstance(value, (int, float, bool)):
return str(value)
return force_text(value)
@needs_values
@set_dirty_and_flush_changes
def set_value(self, section, option, value):
"""Sets the given option in section to the given value.
It will create the section if required and, unlike the default ConfigParser 'set'
method, will not raise an exception.
:param section: Name of the section in which the option resides or should reside
:param option: Name of the options whose value to set
:param value: Value to set the option to. It must be a string or convertible
to a string
:return: this instance"""
if not self.has_section(section):
self.add_section(section)
self.set(section, option, self._value_to_string(value))
return self
@needs_values
@set_dirty_and_flush_changes
def add_value(self, section, option, value):
"""Adds a value for the given option in section.
It will create the section if required and, unlike the default ConfigParser 'set'
method, will not raise an exception. The value becomes the new value of the option as
returned by 'get_value', and is appended to the list of values returned by 'get_values'.
:param section: Name of the section in which the option resides or should reside
:param option: Name of the option
:param value: Value to add to option. It must be a string or convertible
to a string
:return: this instance"""
if not self.has_section(section):
self.add_section(section)
self._sections[section].add(option, self._value_to_string(value))
return self
def rename_section(self, section, new_name):
"""rename the given section to new_name
:raise ValueError: if section doesn't exist
:raise ValueError: if a section with new_name does already exist
:return: this instance
"""
if not self.has_section(section):
raise ValueError("Source section '%s' doesn't exist" % section)
if self.has_section(new_name):
raise ValueError("Destination section '%s' already exists" % new_name)
super(GitConfigParser, self).add_section(new_name)
new_section = self._sections[new_name]
for k, vs in self.items_all(section):
new_section.setall(k, vs)
# end for each value to copy
# This call writes back the changes, which is why we don't have the respective decorator
self.remove_section(section)
return self
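# Usage sketch (not part of the original file): reading and writing values with
# GitConfigParser. Repo.config_reader()/config_writer() return instances of this class;
# the option names and the repository path '.' below are illustrative assumptions.
def _demo_git_config_parser():
    from git import Repo
    repo = Repo('.')
    with repo.config_reader() as reader:          # read-only parser over all levels
        email = reader.get_value("user", "email", default="unknown")
    with repo.config_writer() as writer:          # locked, single-file parser
        writer.set_value("user", "email", email)  # flushed back when released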

View File

@ -1,60 +0,0 @@
"""Module with our own gitdb implementation - it uses the git command"""
from git.util import bin_to_hex, hex_to_bin
from gitdb.base import (
OInfo,
OStream
)
from gitdb.db import GitDB # @UnusedImport
from gitdb.db import LooseObjectDB
from .exc import (
GitCommandError,
BadObject
)
__all__ = ('GitCmdObjectDB', 'GitDB')
# class GitCmdObjectDB(CompoundDB, ObjectDBW):
class GitCmdObjectDB(LooseObjectDB):
"""A database representing the default git object store, which includes loose
objects, pack files and an alternates file
It will create objects only in the loose object database.
:note: for now, we use the git command to do all the lookup, just until we
have packs and the other implementations
"""
def __init__(self, root_path, git):
"""Initialize this instance with the root and a git command"""
super(GitCmdObjectDB, self).__init__(root_path)
self._git = git
def info(self, sha):
hexsha, typename, size = self._git.get_object_header(bin_to_hex(sha))
return OInfo(hex_to_bin(hexsha), typename, size)
def stream(self, sha):
"""For now, all lookup is done by git itself"""
hexsha, typename, size, stream = self._git.stream_object_data(bin_to_hex(sha))
return OStream(hex_to_bin(hexsha), typename, size, stream)
# { Interface
def partial_to_complete_sha_hex(self, partial_hexsha):
""":return: Full binary 20 byte sha from the given partial hexsha
:raise AmbiguousObjectName:
:raise BadObject:
:note: currently we only raise BadObject as git does not communicate
AmbiguousObjects separately"""
try:
hexsha, _typename, _size = self._git.get_object_header(partial_hexsha)
return hex_to_bin(hexsha)
except (GitCommandError, ValueError) as e:
raise BadObject(partial_hexsha) from e
# END handle exceptions
#} END interface
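# Usage sketch (not part of the original file): looking up object information through the
# command-backed database. Repo.odb is such a database by default; the repository path
# '.' is a hypothetical assumption.
def _demo_git_cmd_object_db():
    from git import Repo
    from git.util import hex_to_bin
    repo = Repo('.')
    binsha = hex_to_bin(repo.head.commit.hexsha)   # 20 byte binary sha of HEAD's commit
    info = repo.odb.info(binsha)                   # OInfo with binsha, type and size
    stream = repo.odb.stream(binsha)               # OStream additionally offers read()
    print(info.type, info.size, len(stream.read()))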

View File

@ -1,534 +0,0 @@
# diff.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
import re
from git.cmd import handle_process_output
from git.compat import defenc
from git.util import finalize_process, hex_to_bin
from .objects.blob import Blob
from .objects.util import mode_str_to_int
__all__ = ('Diffable', 'DiffIndex', 'Diff', 'NULL_TREE')
# Special object to compare against the empty tree in diffs
NULL_TREE = object()
_octal_byte_re = re.compile(b'\\\\([0-9]{3})')
def _octal_repl(matchobj):
value = matchobj.group(1)
value = int(value, 8)
value = bytes(bytearray((value,)))
return value
def decode_path(path, has_ab_prefix=True):
if path == b'/dev/null':
return None
if path.startswith(b'"') and path.endswith(b'"'):
path = (path[1:-1].replace(b'\\n', b'\n')
.replace(b'\\t', b'\t')
.replace(b'\\"', b'"')
.replace(b'\\\\', b'\\'))
path = _octal_byte_re.sub(_octal_repl, path)
if has_ab_prefix:
assert path.startswith(b'a/') or path.startswith(b'b/')
path = path[2:]
return path
class Diffable(object):
"""Common interface for all object that can be diffed against another object of compatible type.
:note:
Subclasses require a repo member, as is the case for Object instances; for practical
reasons we do not derive from Object."""
__slots__ = ()
# standin indicating you want to diff against the index
class Index(object):
pass
def _process_diff_args(self, args):
"""
:return:
possibly altered version of the given args list.
Method is called right before git command execution.
Subclasses can use it to alter the behaviour of the superclass"""
return args
def diff(self, other=Index, paths=None, create_patch=False, **kwargs):
"""Creates diffs between two items being trees, trees and index or an
index and the working tree. It will detect renames automatically.
:param other:
Is the item to compare us with.
If None, we will be compared to the working tree.
If Treeish, it will be compared against the respective tree
If Index ( type ), it will be compared against the index.
If git.NULL_TREE, it will compare against the empty tree.
It defaults to Index so that the method will not fail by default
on bare repositories.
:param paths:
is a list of paths or a single path to limit the diff to.
Only diffs touching at least one of the given paths will be included.
:param create_patch:
If True, the returned Diff contains a detailed patch that, if applied,
transforms self into other. Patches are somewhat costly as blobs have to be read
and diffed.
:param kwargs:
Additional arguments passed to git-diff, such as
R=True to swap both sides of the diff.
:return: git.DiffIndex
:note:
On a bare repository, 'other' needs to be provided as Index or as
Tree/Commit, or a git command error will occur
args = []
args.append("--abbrev=40") # we need full shas
args.append("--full-index") # get full index paths, not only filenames
args.append("-M") # check for renames, in both formats
if create_patch:
args.append("-p")
else:
args.append("--raw")
args.append("-z")
# in any way, assure we don't see colored output,
# fixes https://github.com/gitpython-developers/GitPython/issues/172
args.append('--no-color')
if paths is not None and not isinstance(paths, (tuple, list)):
paths = [paths]
diff_cmd = self.repo.git.diff
if other is self.Index:
args.insert(0, '--cached')
elif other is NULL_TREE:
args.insert(0, '-r') # recursive diff-tree
args.insert(0, '--root')
diff_cmd = self.repo.git.diff_tree
elif other is not None:
args.insert(0, '-r') # recursive diff-tree
args.insert(0, other)
diff_cmd = self.repo.git.diff_tree
args.insert(0, self)
# paths is list here or None
if paths:
args.append("--")
args.extend(paths)
# END paths handling
kwargs['as_process'] = True
proc = diff_cmd(*self._process_diff_args(args), **kwargs)
diff_method = (Diff._index_from_patch_format
if create_patch
else Diff._index_from_raw_format)
index = diff_method(self.repo, proc)
proc.wait()
return index
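# Usage sketch (not part of the original file): typical invocations of diff(). Commits,
# trees and the index all mix in Diffable; the revision names and repository path are
# illustrative assumptions.
def _demo_diffable():
    from git import Repo, NULL_TREE
    repo = Repo('.')
    commit = repo.head.commit
    patch_index = commit.diff('HEAD~1', create_patch=True)  # detailed patches included
    raw_index = commit.diff(NULL_TREE)                       # compare against empty tree
    work_index = repo.index.diff(None)                       # index vs. working tree
    return patch_index, raw_index, work_index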
class DiffIndex(list):
"""Implements an Index for diffs, allowing a list of Diffs to be queried by
the diff properties.
The class improves the diff handling convenience"""
# change type invariant identifying possible ways a blob can have changed
# A = Added
# D = Deleted
# R = Renamed
# M = Modified
# T = Changed in the type
change_type = ("A", "C", "D", "R", "M", "T")
def iter_change_type(self, change_type):
"""
:return:
iterator yielding Diff instances that match the given change_type
:param change_type:
Member of DiffIndex.change_type, namely:
* 'A' for added paths
* 'D' for deleted paths
* 'R' for renamed paths
* 'M' for paths with modified data
* 'T' for changed in the type paths
"""
if change_type not in self.change_type:
raise ValueError("Invalid change type: %s" % change_type)
for diff in self:
if diff.change_type == change_type:
yield diff
elif change_type == "A" and diff.new_file:
yield diff
elif change_type == "D" and diff.deleted_file:
yield diff
elif change_type == "C" and diff.copied_file:
yield diff
elif change_type == "R" and diff.renamed:
yield diff
elif change_type == "M" and diff.a_blob and diff.b_blob and diff.a_blob != diff.b_blob:
yield diff
# END for each diff
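# Usage sketch (not part of the original file): filtering a DiffIndex by change type.
# The revision name and repository path are illustrative assumptions.
def _demo_iter_change_type():
    from git import Repo
    repo = Repo('.')
    diff_index = repo.head.commit.diff('HEAD~1')
    for diff in diff_index.iter_change_type('M'):   # only paths with modified data
        print(diff.a_path, '->', diff.b_path)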
class Diff(object):
"""A Diff contains diff information between two Trees.
It contains two sides a and b of the diff; members are prefixed with
"a" and "b" respectively to indicate that.
Diffs keep information about the changed blob objects, the file mode, renames,
deletions and new files.
There are a few cases where None has to be expected as member variable value:
``New File``::
a_mode is None
a_blob is None
a_path is None
``Deleted File``::
b_mode is None
b_blob is None
b_path is None
``Working Tree Blobs``
When comparing to working trees, the working tree blob will have a null hexsha
as a corresponding object does not yet exist. The mode will be null as well.
The path, however, will be available.
If it is listed in a diff, the working tree version of the file must
be different from the version in the index or tree, and hence has been modified."""
# precompiled regex
re_header = re.compile(br"""
^diff[ ]--git
[ ](?P<a_path_fallback>"?[ab]/.+?"?)[ ](?P<b_path_fallback>"?[ab]/.+?"?)\n
(?:^old[ ]mode[ ](?P<old_mode>\d+)\n
^new[ ]mode[ ](?P<new_mode>\d+)(?:\n|$))?
(?:^similarity[ ]index[ ]\d+%\n
^rename[ ]from[ ](?P<rename_from>.*)\n
^rename[ ]to[ ](?P<rename_to>.*)(?:\n|$))?
(?:^new[ ]file[ ]mode[ ](?P<new_file_mode>.+)(?:\n|$))?
(?:^deleted[ ]file[ ]mode[ ](?P<deleted_file_mode>.+)(?:\n|$))?
(?:^similarity[ ]index[ ]\d+%\n
^copy[ ]from[ ].*\n
^copy[ ]to[ ](?P<copied_file_name>.*)(?:\n|$))?
(?:^index[ ](?P<a_blob_id>[0-9A-Fa-f]+)
\.\.(?P<b_blob_id>[0-9A-Fa-f]+)[ ]?(?P<b_mode>.+)?(?:\n|$))?
(?:^---[ ](?P<a_path>[^\t\n\r\f\v]*)[\t\r\f\v]*(?:\n|$))?
(?:^\+\+\+[ ](?P<b_path>[^\t\n\r\f\v]*)[\t\r\f\v]*(?:\n|$))?
""", re.VERBOSE | re.MULTILINE)
# can be used for comparisons
NULL_HEX_SHA = "0" * 40
NULL_BIN_SHA = b"\0" * 20
__slots__ = ("a_blob", "b_blob", "a_mode", "b_mode", "a_rawpath", "b_rawpath",
"new_file", "deleted_file", "copied_file", "raw_rename_from",
"raw_rename_to", "diff", "change_type", "score")
def __init__(self, repo, a_rawpath, b_rawpath, a_blob_id, b_blob_id, a_mode,
b_mode, new_file, deleted_file, copied_file, raw_rename_from,
raw_rename_to, diff, change_type, score):
self.a_mode = a_mode
self.b_mode = b_mode
assert a_rawpath is None or isinstance(a_rawpath, bytes)
assert b_rawpath is None or isinstance(b_rawpath, bytes)
self.a_rawpath = a_rawpath
self.b_rawpath = b_rawpath
if self.a_mode:
self.a_mode = mode_str_to_int(self.a_mode)
if self.b_mode:
self.b_mode = mode_str_to_int(self.b_mode)
# Determine whether this diff references a submodule, if it does then
# we need to overwrite "repo" to the corresponding submodule's repo instead
if repo and a_rawpath:
for submodule in repo.submodules:
if submodule.path == a_rawpath.decode(defenc, 'replace'):
if submodule.module_exists():
repo = submodule.module()
break
if a_blob_id is None or a_blob_id == self.NULL_HEX_SHA:
self.a_blob = None
else:
self.a_blob = Blob(repo, hex_to_bin(a_blob_id), mode=self.a_mode, path=self.a_path)
if b_blob_id is None or b_blob_id == self.NULL_HEX_SHA:
self.b_blob = None
else:
self.b_blob = Blob(repo, hex_to_bin(b_blob_id), mode=self.b_mode, path=self.b_path)
self.new_file = new_file
self.deleted_file = deleted_file
self.copied_file = copied_file
# be clear and use None instead of empty strings
assert raw_rename_from is None or isinstance(raw_rename_from, bytes)
assert raw_rename_to is None or isinstance(raw_rename_to, bytes)
self.raw_rename_from = raw_rename_from or None
self.raw_rename_to = raw_rename_to or None
self.diff = diff
self.change_type = change_type
self.score = score
def __eq__(self, other):
for name in self.__slots__:
if getattr(self, name) != getattr(other, name):
return False
# END for each name
return True
def __ne__(self, other):
return not (self == other)
def __hash__(self):
return hash(tuple(getattr(self, n) for n in self.__slots__))
def __str__(self):
h = "%s"
if self.a_blob:
h %= self.a_blob.path
elif self.b_blob:
h %= self.b_blob.path
msg = ''
line = None # temp line
line_length = 0 # line length
for b, n in zip((self.a_blob, self.b_blob), ('lhs', 'rhs')):
if b:
line = "\n%s: %o | %s" % (n, b.mode, b.hexsha)
else:
line = "\n%s: None" % n
# END if blob is not None
line_length = max(len(line), line_length)
msg += line
# END for each blob
# add headline
h += '\n' + '=' * line_length
if self.deleted_file:
msg += '\nfile deleted in rhs'
if self.new_file:
msg += '\nfile added in rhs'
if self.copied_file:
msg += '\nfile %r copied from %r' % (self.b_path, self.a_path)
if self.rename_from:
msg += '\nfile renamed from %r' % self.rename_from
if self.rename_to:
msg += '\nfile renamed to %r' % self.rename_to
if self.diff:
msg += '\n---'
try:
msg += self.diff.decode(defenc)
except UnicodeDecodeError:
msg += 'OMITTED BINARY DATA'
# end handle encoding
msg += '\n---'
# END diff info
# Python2 silliness: have to assure we convert our likely to be unicode object to a string with the
# right encoding. Otherwise it tries to convert it using ascii, which may fail ungracefully
res = h + msg
# end
return res
@property
def a_path(self):
return self.a_rawpath.decode(defenc, 'replace') if self.a_rawpath else None
@property
def b_path(self):
return self.b_rawpath.decode(defenc, 'replace') if self.b_rawpath else None
@property
def rename_from(self):
return self.raw_rename_from.decode(defenc, 'replace') if self.raw_rename_from else None
@property
def rename_to(self):
return self.raw_rename_to.decode(defenc, 'replace') if self.raw_rename_to else None
@property
def renamed(self):
""":returns: True if the blob of our diff has been renamed
:note: This property is deprecated, please use ``renamed_file`` instead.
"""
return self.renamed_file
@property
def renamed_file(self):
""":returns: True if the blob of our diff has been renamed
"""
return self.rename_from != self.rename_to
@classmethod
def _pick_best_path(cls, path_match, rename_match, path_fallback_match):
if path_match:
return decode_path(path_match)
if rename_match:
return decode_path(rename_match, has_ab_prefix=False)
if path_fallback_match:
return decode_path(path_fallback_match)
return None
@classmethod
def _index_from_patch_format(cls, repo, proc):
"""Create a new DiffIndex from the given text which must be in patch format
:param repo: is the repository we are operating on - it is required
:param stream: result of 'git diff' as a stream (supporting file protocol)
:return: git.DiffIndex """
## FIXME: Here SLURPING raw, need to re-phrase header-regexes linewise.
text = []
handle_process_output(proc, text.append, None, finalize_process, decode_streams=False)
# for now, we have to bake the stream
text = b''.join(text)
index = DiffIndex()
previous_header = None
header = None
for _header in cls.re_header.finditer(text):
a_path_fallback, b_path_fallback, \
old_mode, new_mode, \
rename_from, rename_to, \
new_file_mode, deleted_file_mode, copied_file_name, \
a_blob_id, b_blob_id, b_mode, \
a_path, b_path = _header.groups()
new_file, deleted_file, copied_file = \
bool(new_file_mode), bool(deleted_file_mode), bool(copied_file_name)
a_path = cls._pick_best_path(a_path, rename_from, a_path_fallback)
b_path = cls._pick_best_path(b_path, rename_to, b_path_fallback)
# Our only means to find the actual text is to see what has not been matched by our regex,
# and then retro-actively assign it to our index
if previous_header is not None:
index[-1].diff = text[previous_header.end():_header.start()]
# end assign actual diff
# Make sure the mode is set if the path is set. Otherwise the resulting blob is invalid
# We just use the one mode we should have parsed
a_mode = old_mode or deleted_file_mode or (a_path and (b_mode or new_mode or new_file_mode))
b_mode = b_mode or new_mode or new_file_mode or (b_path and a_mode)
index.append(Diff(repo,
a_path,
b_path,
a_blob_id and a_blob_id.decode(defenc),
b_blob_id and b_blob_id.decode(defenc),
a_mode and a_mode.decode(defenc),
b_mode and b_mode.decode(defenc),
new_file, deleted_file, copied_file,
rename_from,
rename_to,
None, None, None))
previous_header = _header
header = _header
# end for each header we parse
if index:
index[-1].diff = text[header.end():]
# end assign last diff
return index
@classmethod
def _index_from_raw_format(cls, repo, proc):
"""Create a new DiffIndex from the given stream which must be in raw format.
:return: git.DiffIndex"""
# handles
# :100644 100644 687099101... 37c5e30c8... M .gitignore
index = DiffIndex()
def handle_diff_line(lines):
lines = lines.decode(defenc)
for line in lines.split(':')[1:]:
meta, _, path = line.partition('\x00')
path = path.rstrip('\x00')
old_mode, new_mode, a_blob_id, b_blob_id, _change_type = meta.split(None, 4)
# Change type can be R100
# R: status letter
# 100: score (in case of copy and rename)
change_type = _change_type[0]
score_str = ''.join(_change_type[1:])
score = int(score_str) if score_str.isdigit() else None
path = path.strip()
a_path = path.encode(defenc)
b_path = path.encode(defenc)
deleted_file = False
new_file = False
copied_file = False
rename_from = None
rename_to = None
# NOTE: We cannot conclude the change type from the existence of a blob,
# as diffs against the working tree do not have blobs yet
if change_type == 'D':
b_blob_id = None
deleted_file = True
elif change_type == 'A':
a_blob_id = None
new_file = True
elif change_type == 'C':
copied_file = True
a_path, b_path = path.split('\x00', 1)
a_path = a_path.encode(defenc)
b_path = b_path.encode(defenc)
elif change_type == 'R':
a_path, b_path = path.split('\x00', 1)
a_path = a_path.encode(defenc)
b_path = b_path.encode(defenc)
rename_from, rename_to = a_path, b_path
elif change_type == 'T':
# Nothing to do
pass
# END add/remove handling
diff = Diff(repo, a_path, b_path, a_blob_id, b_blob_id, old_mode, new_mode,
new_file, deleted_file, copied_file, rename_from, rename_to,
'', change_type, score)
index.append(diff)
handle_process_output(proc, handle_diff_line, None, finalize_process, decode_streams=False)
return index

View File

@ -1,132 +0,0 @@
# exc.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
""" Module containing all exceptions thrown throughout the git package, """
from gitdb.exc import * # NOQA @UnusedWildImport skipcq: PYL-W0401, PYL-W0614
from git.compat import safe_decode
class GitError(Exception):
""" Base class for all package exceptions """
class InvalidGitRepositoryError(GitError):
""" Thrown if the given repository appears to have an invalid format. """
class WorkTreeRepositoryUnsupported(InvalidGitRepositoryError):
""" Thrown to indicate we can't handle work tree repositories """
class NoSuchPathError(GitError, OSError):
""" Thrown if a path could not be access by the system. """
class CommandError(GitError):
"""Base class for exceptions thrown at every stage of `Popen()` execution.
:param command:
A non-empty list of argv comprising the command-line.
"""
#: A unicode print-format with 2 `%s` for `<cmdline>` and the rest,
#: e.g.
#: "'%s' failed%s"
_msg = "Cmd('%s') failed%s"
def __init__(self, command, status=None, stderr=None, stdout=None):
if not isinstance(command, (tuple, list)):
command = command.split()
self.command = command
self.status = status
if status:
if isinstance(status, Exception):
status = "%s('%s')" % (type(status).__name__, safe_decode(str(status)))
else:
try:
status = 'exit code(%s)' % int(status)
except (ValueError, TypeError):
s = safe_decode(str(status))
status = "'%s'" % s if isinstance(status, str) else s
self._cmd = safe_decode(command[0])
self._cmdline = ' '.join(safe_decode(i) for i in command)
self._cause = status and " due to: %s" % status or "!"
self.stdout = stdout and "\n stdout: '%s'" % safe_decode(stdout) or ''
self.stderr = stderr and "\n stderr: '%s'" % safe_decode(stderr) or ''
def __str__(self):
return (self._msg + "\n cmdline: %s%s%s") % (
self._cmd, self._cause, self._cmdline, self.stdout, self.stderr)
class GitCommandNotFound(CommandError):
"""Thrown if we cannot find the `git` executable in the PATH or at the path given by
the GIT_PYTHON_GIT_EXECUTABLE environment variable"""
def __init__(self, command, cause):
super(GitCommandNotFound, self).__init__(command, cause)
self._msg = "Cmd('%s') not found%s"
class GitCommandError(CommandError):
""" Thrown if execution of the git command fails with non-zero status code. """
def __init__(self, command, status, stderr=None, stdout=None):
super(GitCommandError, self).__init__(command, status, stderr, stdout)
class CheckoutError(GitError):
"""Thrown if a file could not be checked out from the index as it contained
changes.
The .failed_files attribute contains a list of relative paths that failed
to be checked out as they contained changes that did not exist in the index.
The .failed_reasons attribute contains a string informing about the actual
cause of the issue.
The .valid_files attribute contains a list of relative paths to files that
were checked out successfully and hence match the version stored in the
index"""
def __init__(self, message, failed_files, valid_files, failed_reasons):
Exception.__init__(self, message)
self.failed_files = failed_files
self.failed_reasons = failed_reasons
self.valid_files = valid_files
def __str__(self):
return Exception.__str__(self) + ":%s" % self.failed_files
class CacheError(GitError):
"""Base for all errors related to the git index, which is called cache internally"""
class UnmergedEntriesError(CacheError):
"""Thrown if an operation cannot proceed as there are still unmerged
entries in the cache"""
class HookExecutionError(CommandError):
"""Thrown if a hook exits with a non-zero exit code. It provides access to the exit code and the string returned
via standard output"""
def __init__(self, command, status, stderr=None, stdout=None):
super(HookExecutionError, self).__init__(command, status, stderr, stdout)
self._msg = "Hook('%s') failed%s"
class RepositoryDirtyError(GitError):
"""Thrown whenever an operation on a repository fails as it has uncommitted changes that would be overwritten"""
def __init__(self, repo, message):
self.repo = repo
self.message = message
def __str__(self):
return "Operation cannot be performed on %r: %s" % (self.repo, self.message)

View File

@ -1,6 +0,0 @@
"""Initialize the index package"""
# flake8: noqa
from __future__ import absolute_import
from .base import *
from .typ import *

View File

@ -1,381 +0,0 @@
# Contains standalone functions to accompany the index implementation and make it
# more versatile
# NOTE: Autodoc hates it if this is a docstring
from io import BytesIO
import os
from stat import (
S_IFDIR,
S_IFLNK,
S_ISLNK,
S_ISDIR,
S_IFMT,
S_IFREG,
)
import subprocess
from git.cmd import PROC_CREATIONFLAGS, handle_process_output
from git.compat import (
defenc,
force_text,
force_bytes,
is_posix,
safe_decode,
)
from git.exc import (
UnmergedEntriesError,
HookExecutionError
)
from git.objects.fun import (
tree_to_stream,
traverse_tree_recursive,
traverse_trees_recursive
)
from git.util import IndexFileSHA1Writer, finalize_process
from gitdb.base import IStream
from gitdb.typ import str_tree_type
import os.path as osp
from .typ import (
BaseIndexEntry,
IndexEntry,
CE_NAMEMASK,
CE_STAGESHIFT
)
from .util import (
pack,
unpack
)
S_IFGITLINK = S_IFLNK | S_IFDIR # a submodule
CE_NAMEMASK_INV = ~CE_NAMEMASK
__all__ = ('write_cache', 'read_cache', 'write_tree_from_cache', 'entry_key',
'stat_mode_to_index_mode', 'S_IFGITLINK', 'run_commit_hook', 'hook_path')
def hook_path(name, git_dir):
""":return: path to the given named hook in the given git repository directory"""
return osp.join(git_dir, 'hooks', name)
def run_commit_hook(name, index, *args):
"""Run the commit hook of the given name. Silently ignores hooks that do not exist.
:param name: name of hook, like 'pre-commit'
:param index: IndexFile instance
:param args: arguments passed to hook file
:raises HookExecutionError: """
hp = hook_path(name, index.repo.git_dir)
if not os.access(hp, os.X_OK):
return
env = os.environ.copy()
env['GIT_INDEX_FILE'] = safe_decode(index.path)
env['GIT_EDITOR'] = ':'
try:
cmd = subprocess.Popen([hp] + list(args),
env=env,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
cwd=index.repo.working_dir,
close_fds=is_posix,
creationflags=PROC_CREATIONFLAGS,)
except Exception as ex:
raise HookExecutionError(hp, ex) from ex
else:
stdout = []
stderr = []
handle_process_output(cmd, stdout.append, stderr.append, finalize_process)
stdout = ''.join(stdout)
stderr = ''.join(stderr)
if cmd.returncode != 0:
stdout = force_text(stdout, defenc)
stderr = force_text(stderr, defenc)
raise HookExecutionError(hp, cmd.returncode, stderr, stdout)
# end handle return code
def stat_mode_to_index_mode(mode):
"""Convert the given mode from a stat call to the corresponding index mode
and return it"""
if S_ISLNK(mode): # symlinks
return S_IFLNK
if S_ISDIR(mode) or S_IFMT(mode) == S_IFGITLINK: # submodules
return S_IFGITLINK
return S_IFREG | 0o644 | (mode & 0o111) # blobs with or without executable bit
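# Usage sketch (not part of the original file): deriving the index mode for a file on
# disk. The file path is a hypothetical assumption.
def _demo_stat_mode_to_index_mode():
    import os
    mode = os.stat("setup.py").st_mode            # hypothetical tracked file
    index_mode = stat_mode_to_index_mode(mode)
    print(oct(index_mode))                        # e.g. 0o100644 or 0o100755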
def write_cache(entries, stream, extension_data=None, ShaStreamCls=IndexFileSHA1Writer):
"""Write the cache represented by entries to a stream
:param entries: **sorted** list of entries
:param stream: stream to wrap into the AdapterStreamCls - it is used for
final output.
:param ShaStreamCls: Type to use when writing to the stream. It produces a sha
while writing to it, before the data is passed on to the wrapped stream
:param extension_data: any kind of data to write as a trailer, it must begin
with a 4 byte identifier, followed by its size (4 bytes)
# wrap the stream into a compatible writer
stream = ShaStreamCls(stream)
tell = stream.tell
write = stream.write
# header
version = 2
write(b"DIRC")
write(pack(">LL", version, len(entries)))
# body
for entry in entries:
beginoffset = tell()
write(entry[4]) # ctime
write(entry[5]) # mtime
path = entry[3]
path = force_bytes(path, encoding=defenc)
plen = len(path) & CE_NAMEMASK # path length
assert plen == len(path), "Path %s too long to fit into index" % entry[3]
flags = plen | (entry[2] & CE_NAMEMASK_INV) # clear possible previous values
write(pack(">LLLLLL20sH", entry[6], entry[7], entry[0],
entry[8], entry[9], entry[10], entry[1], flags))
write(path)
real_size = ((tell() - beginoffset + 8) & ~7)
write(b"\0" * ((beginoffset + real_size) - tell()))
# END for each entry
# write previously cached extensions data
if extension_data is not None:
stream.write(extension_data)
# write the sha over the content
stream.write_sha()
def read_header(stream):
"""Return tuple(version_long, num_entries) from the given stream"""
type_id = stream.read(4)
if type_id != b"DIRC":
raise AssertionError("Invalid index file header: %r" % type_id)
version, num_entries = unpack(">LL", stream.read(4 * 2))
# TODO: handle version 3: extended data, see read-cache.c
assert version in (1, 2)
return version, num_entries
def entry_key(*entry):
""":return: Key suitable to be used for the index.entries dictionary
:param entry: One instance of type BaseIndexEntry or the path and the stage"""
if len(entry) == 1:
return (entry[0].path, entry[0].stage)
return tuple(entry)
# END handle entry
def read_cache(stream):
"""Read a cache file from the given stream
:return: tuple(version, entries_dict, extension_data, content_sha)
* version is the integer version number
* entries dict is a dictionary which maps IndexEntry instances to a path at a stage
* extension_data is '' or 4 bytes of type + 4 bytes of size + size bytes
* content_sha is a 20 byte sha on all cache file contents"""
version, num_entries = read_header(stream)
count = 0
entries = {}
read = stream.read
tell = stream.tell
while count < num_entries:
beginoffset = tell()
ctime = unpack(">8s", read(8))[0]
mtime = unpack(">8s", read(8))[0]
(dev, ino, mode, uid, gid, size, sha, flags) = \
unpack(">LLLLLL20sH", read(20 + 4 * 6 + 2))
path_size = flags & CE_NAMEMASK
path = read(path_size).decode(defenc)
real_size = ((tell() - beginoffset + 8) & ~7)
read((beginoffset + real_size) - tell())
entry = IndexEntry((mode, sha, flags, path, ctime, mtime, dev, ino, uid, gid, size))
# entry_key would be the method to use, but we save the effort
entries[(path, entry.stage)] = entry
count += 1
# END for each entry
# the footer contains extension data and a sha on the content so far
# Keep the extension footer, and verify we have a sha in the end
# Extension data format is:
# 4 bytes ID
# 4 bytes length of chunk
# repeated 0 - N times
extension_data = stream.read(~0)
assert len(extension_data) > 19, "Index Footer was not at least a sha on content as it was only %i bytes in size"\
% len(extension_data)
content_sha = extension_data[-20:]
# truncate the sha in the end as we will dynamically create it anyway
extension_data = extension_data[:-20]
return (version, entries, extension_data, content_sha)
def write_tree_from_cache(entries, odb, sl, si=0):
"""Create a tree from the given sorted list of entries and put the respective
trees into the given object database
:param entries: **sorted** list of IndexEntries
:param odb: object database to store the trees in
:param si: start index at which we should start creating subtrees
:param sl: slice indicating the range we should process on the entries list
:return: tuple(binsha, list(tree_entry, ...)) a tuple of a sha and a list of
tree entries being a tuple of hexsha, mode, name"""
tree_items = []
tree_items_append = tree_items.append
ci = sl.start
end = sl.stop
while ci < end:
entry = entries[ci]
if entry.stage != 0:
raise UnmergedEntriesError(entry)
# END abort on unmerged
ci += 1
rbound = entry.path.find('/', si)
if rbound == -1:
# it's not a tree
tree_items_append((entry.binsha, entry.mode, entry.path[si:]))
else:
# find common base range
base = entry.path[si:rbound]
xi = ci
while xi < end:
oentry = entries[xi]
orbound = oentry.path.find('/', si)
if orbound == -1 or oentry.path[si:orbound] != base:
break
# END abort on base mismatch
xi += 1
# END find common base
# enter recursion
# ci - 1 as we want to count our current item as well
sha, _tree_entry_list = write_tree_from_cache(entries, odb, slice(ci - 1, xi), rbound + 1)
tree_items_append((sha, S_IFDIR, base))
# skip ahead
ci = xi
# END handle bounds
# END for each entry
# finally create the tree
sio = BytesIO()
tree_to_stream(tree_items, sio.write)
sio.seek(0)
istream = odb.store(IStream(str_tree_type, len(sio.getvalue()), sio))
return (istream.binsha, tree_items)
def _tree_entry_to_baseindexentry(tree_entry, stage):
return BaseIndexEntry((tree_entry[1], tree_entry[0], stage << CE_STAGESHIFT, tree_entry[2]))
def aggressive_tree_merge(odb, tree_shas):
"""
:return: list of BaseIndexEntries representing the aggressive merge of the given
trees. All valid entries are on stage 0, whereas the conflicting ones are left
on stage 1, 2 or 3, whereas stage 1 corresponds to the common ancestor tree,
2 to our tree and 3 to 'their' tree.
:param tree_shas: 1, 2 or 3 trees as identified by their binary 20 byte shas
If 1 or 2, the entries will effectively correspond to the last given tree
If 3 are given, a 3 way merge is performed"""
out = []
out_append = out.append
# one and two way is the same for us, as we don't have to handle an existing
# index, instead
if len(tree_shas) in (1, 2):
for entry in traverse_tree_recursive(odb, tree_shas[-1], ''):
out_append(_tree_entry_to_baseindexentry(entry, 0))
# END for each entry
return out
# END handle single tree
if len(tree_shas) > 3:
raise ValueError("Cannot handle %i trees at once" % len(tree_shas))
# three trees
for base, ours, theirs in traverse_trees_recursive(odb, tree_shas, ''):
if base is not None:
# base version exists
if ours is not None:
# ours exists
if theirs is not None:
# it exists in all branches, if it was changed in both
# it's a conflict, otherwise we take the changed version
# This should be the most common branch, so it comes first
if(base[0] != ours[0] and base[0] != theirs[0] and ours[0] != theirs[0]) or \
(base[1] != ours[1] and base[1] != theirs[1] and ours[1] != theirs[1]):
# changed by both
out_append(_tree_entry_to_baseindexentry(base, 1))
out_append(_tree_entry_to_baseindexentry(ours, 2))
out_append(_tree_entry_to_baseindexentry(theirs, 3))
elif base[0] != ours[0] or base[1] != ours[1]:
# only we changed it
out_append(_tree_entry_to_baseindexentry(ours, 0))
else:
# either nobody changed it, or they did. In either
# case, use theirs
out_append(_tree_entry_to_baseindexentry(theirs, 0))
# END handle modification
else:
if ours[0] != base[0] or ours[1] != base[1]:
# they deleted it, we changed it, conflict
out_append(_tree_entry_to_baseindexentry(base, 1))
out_append(_tree_entry_to_baseindexentry(ours, 2))
# else:
# we didn't change it, ignore
# pass
# END handle our change
# END handle theirs
else:
if theirs is None:
# deleted in both, it's fine - it's out
pass
else:
if theirs[0] != base[0] or theirs[1] != base[1]:
# deleted in ours, changed theirs, conflict
out_append(_tree_entry_to_baseindexentry(base, 1))
out_append(_tree_entry_to_baseindexentry(theirs, 3))
# END theirs changed
# else:
# theirs didn't change
# pass
# END handle theirs
# END handle ours
else:
# all three can't be None
if ours is None:
# added in their branch
out_append(_tree_entry_to_baseindexentry(theirs, 0))
elif theirs is None:
# added in our branch
out_append(_tree_entry_to_baseindexentry(ours, 0))
else:
# both have it, except for the base, see whether it changed
if ours[0] != theirs[0] or ours[1] != theirs[1]:
out_append(_tree_entry_to_baseindexentry(ours, 2))
out_append(_tree_entry_to_baseindexentry(theirs, 3))
else:
# it was added the same in both
out_append(_tree_entry_to_baseindexentry(ours, 0))
# END handle two items
# END handle heads
# END handle base exists
# END for each entries tuple
return out

View File

@ -1,176 +0,0 @@
"""Module with additional types used by the index"""
from binascii import b2a_hex
from .util import (
pack,
unpack
)
from git.objects import Blob
__all__ = ('BlobFilter', 'BaseIndexEntry', 'IndexEntry')
#{ Invariants
CE_NAMEMASK = 0x0fff
CE_STAGEMASK = 0x3000
CE_EXTENDED = 0x4000
CE_VALID = 0x8000
CE_STAGESHIFT = 12
#} END invariants
class BlobFilter(object):
"""
Predicate to be used by iter_blobs, allowing to return only blobs which
match the given list of directories or files.
The given paths are given relative to the repository.
"""
__slots__ = 'paths'
def __init__(self, paths):
"""
:param paths:
tuple or list of paths which are either pointing to directories or
to files relative to the current repository
"""
self.paths = paths
def __call__(self, stage_blob):
path = stage_blob[1].path
for p in self.paths:
if path.startswith(p):
return True
# END for each path in filter paths
return False
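# Usage sketch (not part of the original file): restricting iter_blobs to a subdirectory
# with BlobFilter. The directory name and repository path are illustrative assumptions.
def _demo_blob_filter():
    from git import Repo
    repo = Repo('.')
    for stage, blob in repo.index.iter_blobs(BlobFilter(['docs/'])):
        print(stage, blob.path)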
class BaseIndexEntry(tuple):
"""Small Brother of an index entry which can be created to describe changes
done to the index in which case plenty of additional information is not required.
As the first 4 data members match exactly to the IndexEntry type, methods
expecting a BaseIndexEntry can also handle full IndexEntries even if they
use numeric indices for performance reasons. """
def __str__(self):
return "%o %s %i\t%s" % (self.mode, self.hexsha, self.stage, self.path)
def __repr__(self):
return "(%o, %s, %i, %s)" % (self.mode, self.hexsha, self.stage, self.path)
@property
def mode(self):
""" File Mode, compatible to stat module constants """
return self[0]
@property
def binsha(self):
"""binary sha of the blob """
return self[1]
@property
def hexsha(self):
"""hex version of our sha"""
return b2a_hex(self[1]).decode('ascii')
@property
def stage(self):
"""Stage of the entry, either:
* 0 = default stage
* 1 = stage before a merge or common ancestor entry in case of a 3 way merge
* 2 = stage of entries from the 'left' side of the merge
* 3 = stage of entries from the right side of the merge
:note: For more information, see http://www.kernel.org/pub/software/scm/git/docs/git-read-tree.html
"""
return (self[2] & CE_STAGEMASK) >> CE_STAGESHIFT
@property
def path(self):
""":return: our path relative to the repository working tree root"""
return self[3]
@property
def flags(self):
""":return: flags stored with this entry"""
return self[2]
@classmethod
def from_blob(cls, blob, stage=0):
""":return: Fully equipped BaseIndexEntry at the given stage"""
return cls((blob.mode, blob.binsha, stage << CE_STAGESHIFT, blob.path))
def to_blob(self, repo):
""":return: Blob using the information of this index entry"""
return Blob(repo, self.binsha, self.mode, self.path)
class IndexEntry(BaseIndexEntry):
"""Allows convenient access to IndexEntry data without completely unpacking it.
Attributes usually accessed often are cached in the tuple whereas others are
unpacked on demand.
See the properties for a mapping between names and tuple indices. """
@property
def ctime(self):
"""
:return:
Tuple(int_time_seconds_since_epoch, int_nano_seconds) of the
file's creation time"""
return unpack(">LL", self[4])
@property
def mtime(self):
"""See ctime property, but returns modification time """
return unpack(">LL", self[5])
@property
def dev(self):
""" Device ID """
return self[6]
@property
def inode(self):
""" Inode ID """
return self[7]
@property
def uid(self):
""" User ID """
return self[8]
@property
def gid(self):
""" Group ID """
return self[9]
@property
def size(self):
""":return: Uncompressed size of the blob """
return self[10]
@classmethod
def from_base(cls, base):
"""
:return:
Minimal entry as created from the given BaseIndexEntry instance.
Missing values will be set to null-like values
:param base: Instance of type BaseIndexEntry"""
time = pack(">LL", 0, 0)
return IndexEntry((base.mode, base.binsha, base.flags, base.path, time, time, 0, 0, 0, 0, 0))
@classmethod
def from_blob(cls, blob, stage=0):
""":return: Minimal entry resembling the given blob object"""
time = pack(">LL", 0, 0)
return IndexEntry((blob.mode, blob.binsha, stage << CE_STAGESHIFT, blob.path,
time, time, 0, 0, 0, 0, blob.size))
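# Usage sketch (not part of the original file): building index entries from an existing
# blob. The tree path and repository path are illustrative assumptions.
def _demo_index_entries():
    from git import Repo
    repo = Repo('.')
    blob = repo.head.commit.tree / 'README.md'     # hypothetical path in the tree
    base = BaseIndexEntry.from_blob(blob)           # carries mode, binsha, flags, path
    entry = IndexEntry.from_base(base)              # stat fields filled with null-like values
    print(entry.path, oct(entry.mode), entry.hexsha, entry.stage)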

View File

@ -1,99 +0,0 @@
"""Module containing index utilities"""
from functools import wraps
import os
import struct
import tempfile
from git.compat import is_win
import os.path as osp
__all__ = ('TemporaryFileSwap', 'post_clear_cache', 'default_index', 'git_working_dir')
#{ Aliases
pack = struct.pack
unpack = struct.unpack
#} END aliases
class TemporaryFileSwap(object):
"""Utility class moving a file to a temporary location within the same directory
and moving it back to its original location on object deletion."""
__slots__ = ("file_path", "tmp_file_path")
def __init__(self, file_path):
self.file_path = file_path
self.tmp_file_path = self.file_path + tempfile.mktemp('', '', '')
# it may be that the source does not exist
try:
os.rename(self.file_path, self.tmp_file_path)
except OSError:
pass
def __del__(self):
if osp.isfile(self.tmp_file_path):
if is_win and osp.exists(self.file_path):
os.remove(self.file_path)
os.rename(self.tmp_file_path, self.file_path)
# END temp file exists
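# Usage sketch (not part of the original file): temporarily moving the index file aside
# and letting it be restored when the swap object is deleted. The repository path '.'
# is a hypothetical assumption.
def _demo_temporary_file_swap():
    from git import Repo
    repo = Repo('.')
    swap = TemporaryFileSwap(repo.index.path)   # index file moved to a temporary name
    try:
        pass                                    # work with a pristine/temporary index here
    finally:
        del swap                                # original file moved back into place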
#{ Decorators
def post_clear_cache(func):
"""Decorator for functions that alter the index using the git command. This would
invalidate our possibly existing entries dictionary which is why it must be
deleted to allow it to be lazily reread later.
:note:
This decorator will not be required once all functions are implemented
natively which in fact is possible, but probably not feasible performance wise.
"""
@wraps(func)
def post_clear_cache_if_not_raised(self, *args, **kwargs):
rval = func(self, *args, **kwargs)
self._delete_entries_cache()
return rval
# END wrapper method
return post_clear_cache_if_not_raised
def default_index(func):
"""Decorator assuring the wrapped method may only run if we are the default
repository index. This is as we rely on git commands that operate
on that index only. """
@wraps(func)
def check_default_index(self, *args, **kwargs):
if self._file_path != self._index_path():
raise AssertionError(
"Cannot call %r on indices that do not represent the default git index" % func.__name__)
return func(self, *args, **kwargs)
# END wrapper method
return check_default_index
def git_working_dir(func):
"""Decorator which changes the current working dir to the one of the git
repository in order to assure relative paths are handled correctly"""
@wraps(func)
def set_git_working_dir(self, *args, **kwargs):
cur_wd = os.getcwd()
os.chdir(self.repo.working_tree_dir)
try:
return func(self, *args, **kwargs)
finally:
os.chdir(cur_wd)
# END handle working dir
# END wrapper
return set_git_working_dir
#} END decorators

View File

@ -1,26 +0,0 @@
"""
Import all submodules main classes into the package space
"""
# flake8: noqa
from __future__ import absolute_import
import inspect
from .base import *
from .blob import *
from .commit import *
from .submodule import util as smutil
from .submodule.base import *
from .submodule.root import *
from .tag import *
from .tree import *
# Fix import dependency - add IndexObject to the util module, so that it can be
# imported by the submodule.base
smutil.IndexObject = IndexObject
smutil.Object = Object
del(smutil)
# must come after submodule was made available
__all__ = [name for name, obj in locals().items()
if not (name.startswith('_') or inspect.ismodule(obj))]

View File

@ -1,182 +0,0 @@
# base.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
from git.util import LazyMixin, join_path_native, stream_copy, bin_to_hex
import gitdb.typ as dbtyp
import os.path as osp
from .util import get_object_type_by_name
_assertion_msg_format = "Created object %r whose python type %r disagrees with the acutal git object type %r"
__all__ = ("Object", "IndexObject")
class Object(LazyMixin):
"""Implements an Object which may be Blobs, Trees, Commits and Tags"""
NULL_HEX_SHA = '0' * 40
NULL_BIN_SHA = b'\0' * 20
TYPES = (dbtyp.str_blob_type, dbtyp.str_tree_type, dbtyp.str_commit_type, dbtyp.str_tag_type)
__slots__ = ("repo", "binsha", "size")
type = None # to be set by subclass
def __init__(self, repo, binsha):
"""Initialize an object by identifying it by its binary sha.
All keyword arguments will be set on demand if None.
:param repo: repository this object is located in
:param binsha: 20 byte SHA1"""
super(Object, self).__init__()
self.repo = repo
self.binsha = binsha
assert len(binsha) == 20, "Require 20 byte binary sha, got %r, len = %i" % (binsha, len(binsha))
@classmethod
def new(cls, repo, id): # @ReservedAssignment
"""
:return: New Object instance of a type appropriate to the object type behind
id. The id of the newly created object will be a binsha even though
the input id may have been a Reference or Rev-Spec
:param id: reference, rev-spec, or hexsha
:note: This cannot be a __new__ method as it would always call __init__
with the input id which is not necessarily a binsha."""
return repo.rev_parse(str(id))
@classmethod
def new_from_sha(cls, repo, sha1):
"""
:return: new object instance of a type appropriate to represent the given
binary sha1
:param sha1: 20 byte binary sha1"""
if sha1 == cls.NULL_BIN_SHA:
# the NULL binsha is always the root commit
return get_object_type_by_name(b'commit')(repo, sha1)
# END handle special case
oinfo = repo.odb.info(sha1)
inst = get_object_type_by_name(oinfo.type)(repo, oinfo.binsha)
inst.size = oinfo.size
return inst
def _set_cache_(self, attr):
"""Retrieve object information"""
if attr == "size":
oinfo = self.repo.odb.info(self.binsha)
self.size = oinfo.size
# assert oinfo.type == self.type, _assertion_msg_format % (self.binsha, oinfo.type, self.type)
else:
super(Object, self)._set_cache_(attr)
def __eq__(self, other):
""":return: True if the objects have the same SHA1"""
if not hasattr(other, 'binsha'):
return False
return self.binsha == other.binsha
def __ne__(self, other):
""":return: True if the objects do not have the same SHA1 """
if not hasattr(other, 'binsha'):
return True
return self.binsha != other.binsha
def __hash__(self):
""":return: Hash of our id allowing objects to be used in dicts and sets"""
return hash(self.binsha)
def __str__(self):
""":return: string of our SHA1 as understood by all git commands"""
return self.hexsha
def __repr__(self):
""":return: string with pythonic representation of our object"""
return '<git.%s "%s">' % (self.__class__.__name__, self.hexsha)
@property
def hexsha(self):
""":return: 40 byte hex version of our 20 byte binary sha"""
# b2a_hex produces bytes
return bin_to_hex(self.binsha).decode('ascii')
@property
def data_stream(self):
""" :return: File Object compatible stream to the uncompressed raw data of the object
:note: returned streams must be read in order"""
return self.repo.odb.stream(self.binsha)
def stream_data(self, ostream):
"""Writes our data directly to the given output stream
:param ostream: File object compatible stream object.
:return: self"""
istream = self.repo.odb.stream(self.binsha)
stream_copy(istream, ostream)
return self
class IndexObject(Object):
"""Base for all objects that can be part of the index file , namely Tree, Blob and
SubModule objects"""
__slots__ = ("path", "mode")
# for compatibility with iterable lists
_id_attribute_ = 'path'
def __init__(self, repo, binsha, mode=None, path=None):
"""Initialize a newly instanced IndexObject
:param repo: is the Repo we are located in
:param binsha: 20 byte sha1
:param mode:
is the stat compatible file mode as int, use the stat module
to evaluate the information
:param path:
is the path to the file in the file system, relative to the git repository root, i.e.
file.ext or folder/other.ext
:note:
Path may not be set if the index object has been created directly, as it cannot
be retrieved without knowing the parent tree."""
super(IndexObject, self).__init__(repo, binsha)
if mode is not None:
self.mode = mode
if path is not None:
self.path = path
def __hash__(self):
"""
:return:
Hash of our path as index items are uniquely identifiable by path, not
by their data !"""
return hash(self.path)
def _set_cache_(self, attr):
if attr in IndexObject.__slots__:
# they cannot be retrieved later on (not without searching for them)
raise AttributeError(
"Attribute '%s' unset: path and mode attributes must have been set during %s object creation"
% (attr, type(self).__name__))
else:
super(IndexObject, self)._set_cache_(attr)
# END handle slot attribute
@property
def name(self):
""":return: Name portion of the path, effectively being the basename"""
return osp.basename(self.path)
@property
def abspath(self):
"""
:return:
Absolute path to this index object in the file system (as opposed to the
.path field, which is a path relative to the git repository).
The returned path will be native to the system and will contain '\' on Windows. """
return join_path_native(self.repo.working_tree_dir, self.path)
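A brief usage sketch of the Object/IndexObject API above; the path './my_repo' is a hypothetical local clone with at least one commit and one file at the repository root.
from git import Repo

repo = Repo('./my_repo')
head_commit = repo.head.commit           # a Commit is an Object
print(head_commit.hexsha)                # 40-char hex form of the 20-byte binsha
print(head_commit.size)                  # loaded lazily via _set_cache_

blob = head_commit.tree.blobs[0]         # Blob is an IndexObject
print(blob.name, oct(blob.mode), blob.path)   # path is repo-relative
print(blob.abspath)                      # native absolute path in the working tree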

View File

@ -1,33 +0,0 @@
# blob.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
from mimetypes import guess_type
from . import base
__all__ = ('Blob', )
class Blob(base.IndexObject):
"""A Blob encapsulates a git blob object"""
DEFAULT_MIME_TYPE = "text/plain"
type = "blob"
# valid blob modes
executable_mode = 0o100755
file_mode = 0o100644
link_mode = 0o120000
__slots__ = ()
@property
def mime_type(self):
"""
:return: String describing the mime type of this file (based on the filename)
:note: Defaults to 'text/plain' in case the actual file type is unknown. """
guesses = None
if self.path:
guesses = guess_type(self.path)
return guesses and guesses[0] or self.DEFAULT_MIME_TYPE
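A short sketch of Blob.mime_type in use; the repository path and the 'README.md' entry are assumptions for illustration.
from git import Repo

repo = Repo('./my_repo')
blob = repo.head.commit.tree / 'README.md'   # Tree division returns the Blob
print(blob.mime_type)                        # guessed from the file name, else 'text/plain'
print(oct(blob.mode))                        # e.g. 0o100644, one of the modes listed above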

View File

@ -1,532 +0,0 @@
# commit.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
from gitdb import IStream
from git.util import (
hex_to_bin,
Actor,
Iterable,
Stats,
finalize_process
)
from git.diff import Diffable
from .tree import Tree
from . import base
from .util import (
Traversable,
Serializable,
parse_date,
altz_to_utctz_str,
parse_actor_and_date,
from_timestamp,
)
from time import (
time,
daylight,
altzone,
timezone,
localtime
)
import os
from io import BytesIO
import logging
log = logging.getLogger('git.objects.commit')
log.addHandler(logging.NullHandler())
__all__ = ('Commit', )
class Commit(base.Object, Iterable, Diffable, Traversable, Serializable):
"""Wraps a git Commit object.
This class will act lazily on some of its attributes and will query the
value on demand only if it involves calling the git binary."""
# ENVIRONMENT VARIABLES
# read when creating new commits
env_author_date = "GIT_AUTHOR_DATE"
env_committer_date = "GIT_COMMITTER_DATE"
# CONFIGURATION KEYS
conf_encoding = 'i18n.commitencoding'
# INVARIANTS
default_encoding = "UTF-8"
# object configuration
type = "commit"
__slots__ = ("tree",
"author", "authored_date", "author_tz_offset",
"committer", "committed_date", "committer_tz_offset",
"message", "parents", "encoding", "gpgsig")
_id_attribute_ = "hexsha"
def __init__(self, repo, binsha, tree=None, author=None, authored_date=None, author_tz_offset=None,
committer=None, committed_date=None, committer_tz_offset=None,
message=None, parents=None, encoding=None, gpgsig=None):
"""Instantiate a new Commit. All keyword arguments taking None as default will
be implicitly set on first query.
:param binsha: 20 byte sha1
:param parents: tuple( Commit, ... )
is a tuple of commit ids or actual Commits
:param tree: Tree
Tree object
:param author: Actor
is the author Actor object
:param authored_date: int_seconds_since_epoch
is the authored DateTime - use time.gmtime() to convert it into a
different format
:param author_tz_offset: int_seconds_west_of_utc
is the timezone that the authored_date is in
:param committer: Actor
is the committer Actor object
:param committed_date: int_seconds_since_epoch
is the committed DateTime - use time.gmtime() to convert it into a
different format
:param committer_tz_offset: int_seconds_west_of_utc
is the timezone that the committed_date is in
:param message: string
is the commit message
:param encoding: string
encoding of the message, defaults to UTF-8
:param parents:
List or tuple of Commit objects which are our parent(s) in the commit
dependency graph
:return: git.Commit
:note:
Timezone information is in the same format and in the same sign
as what time.altzone returns. The sign is inverted compared to git's
UTC timezone."""
super(Commit, self).__init__(repo, binsha)
if tree is not None:
assert isinstance(tree, Tree), "Tree needs to be a Tree instance, was %s" % type(tree)
if tree is not None:
self.tree = tree
if author is not None:
self.author = author
if authored_date is not None:
self.authored_date = authored_date
if author_tz_offset is not None:
self.author_tz_offset = author_tz_offset
if committer is not None:
self.committer = committer
if committed_date is not None:
self.committed_date = committed_date
if committer_tz_offset is not None:
self.committer_tz_offset = committer_tz_offset
if message is not None:
self.message = message
if parents is not None:
self.parents = parents
if encoding is not None:
self.encoding = encoding
if gpgsig is not None:
self.gpgsig = gpgsig
@classmethod
def _get_intermediate_items(cls, commit):
return commit.parents
def _set_cache_(self, attr):
if attr in Commit.__slots__:
# read the data in a chunk - it's faster - then provide a file wrapper
_binsha, _typename, self.size, stream = self.repo.odb.stream(self.binsha)
self._deserialize(BytesIO(stream.read()))
else:
super(Commit, self)._set_cache_(attr)
# END handle attrs
@property
def authored_datetime(self):
return from_timestamp(self.authored_date, self.author_tz_offset)
@property
def committed_datetime(self):
return from_timestamp(self.committed_date, self.committer_tz_offset)
@property
def summary(self):
""":return: First line of the commit message"""
return self.message.split('\n', 1)[0]
def count(self, paths='', **kwargs):
"""Count the number of commits reachable from this commit
:param paths:
is an optional path or a list of paths restricting the return value
to commits actually containing the paths
:param kwargs:
Additional options to be passed to git-rev-list. They must not alter
the output style of the command, or parsing will yield incorrect results
:return: int defining the number of reachable commits"""
# yes, it makes a difference whether empty paths are given or not in our case
# as the empty paths version will ignore merge commits for some reason.
if paths:
return len(self.repo.git.rev_list(self.hexsha, '--', paths, **kwargs).splitlines())
return len(self.repo.git.rev_list(self.hexsha, **kwargs).splitlines())
@property
def name_rev(self):
"""
:return:
String describing the commit's hex sha based on the closest Reference.
Mostly useful for UI purposes"""
return self.repo.git.name_rev(self)
@classmethod
def iter_items(cls, repo, rev, paths='', **kwargs):
"""Find all commits matching the given criteria.
:param repo: is the Repo
:param rev: revision specifier, see git-rev-parse for viable options
:param paths:
is an optional path or list of paths, if set only Commits that include the path
or paths will be considered
:param kwargs:
optional keyword arguments to git rev-list where
``max_count`` is the maximum number of commits to fetch
``skip`` is the number of commits to skip
``since`` all commits since i.e. '1970-01-01'
:return: iterator yielding Commit items"""
if 'pretty' in kwargs:
raise ValueError("--pretty cannot be used as parsing expects single sha's only")
# END handle pretty
# use -- in any case, to prevent possibility of ambiguous arguments
# see https://github.com/gitpython-developers/GitPython/issues/264
args = ['--']
if paths:
args.extend((paths, ))
# END if paths
proc = repo.git.rev_list(rev, args, as_process=True, **kwargs)
return cls._iter_from_process_or_stream(repo, proc)
def iter_parents(self, paths='', **kwargs):
"""Iterate _all_ parents of this commit.
:param paths:
Optional path or list of paths limiting the Commits to those that
contain at least one of the paths
:param kwargs: All arguments allowed by git-rev-list
:return: Iterator yielding Commit objects which are parents of self """
# skip ourselves
skip = kwargs.get("skip", 1)
if skip == 0: # skip ourselves
skip = 1
kwargs['skip'] = skip
return self.iter_items(self.repo, self, paths, **kwargs)
@property
def stats(self):
"""Create a git stat from changes between this commit and its first parent
or from all changes done if this is the very first commit.
:return: git.Stats"""
if not self.parents:
text = self.repo.git.diff_tree(self.hexsha, '--', numstat=True, root=True)
text2 = ""
for line in text.splitlines()[1:]:
(insertions, deletions, filename) = line.split("\t")
text2 += "%s\t%s\t%s\n" % (insertions, deletions, filename)
text = text2
else:
text = self.repo.git.diff(self.parents[0].hexsha, self.hexsha, '--', numstat=True)
return Stats._list_from_string(self.repo, text)
@classmethod
def _iter_from_process_or_stream(cls, repo, proc_or_stream):
"""Parse out commit information into a list of Commit objects
We expect one line per commit, and parse the actual commit information directly
from our lightning-fast object database
:param proc_or_stream: git-rev-list process instance - one sha per line
:return: iterator returning Commit objects"""
stream = proc_or_stream
if not hasattr(stream, 'readline'):
stream = proc_or_stream.stdout
readline = stream.readline
while True:
line = readline()
if not line:
break
hexsha = line.strip()
if len(hexsha) > 40:
# split additional information, as returned by bisect for instance
hexsha, _ = line.split(None, 1)
# END handle extra info
assert len(hexsha) == 40, "Invalid line: %s" % hexsha
yield Commit(repo, hex_to_bin(hexsha))
# END for each line in stream
# TODO: Review this - it seems process handling got a bit out of control
# due to many developers trying to fix the open file handles issue
if hasattr(proc_or_stream, 'wait'):
finalize_process(proc_or_stream)
@classmethod
def create_from_tree(cls, repo, tree, message, parent_commits=None, head=False, author=None, committer=None,
author_date=None, commit_date=None):
"""Commit the given tree, creating a commit object.
:param repo: Repo object the commit should be part of
:param tree: Tree object or hex or bin sha
the tree of the new commit
:param message: Commit message. It may be an empty string if no message is provided.
It will be converted to a string in any case.
:param parent_commits:
Optional Commit objects to use as parents for the new commit.
If empty list, the commit will have no parents at all and become
a root commit.
If None, the current head commit will be the parent of the
new commit object
:param head:
If True, the HEAD will be advanced to the new commit automatically.
Else the HEAD will remain pointing on the previous commit. This could
lead to undesired results when diffing files.
:param author: The name of the author, optional. If unset, the repository
configuration is used to obtain this value.
:param committer: The name of the committer, optional. If unset, the
repository configuration is used to obtain this value.
:param author_date: The timestamp for the author field
:param commit_date: The timestamp for the committer field
:return: Commit object representing the new commit
:note:
Additional information about the committer and author is taken from the
environment or from the git configuration, see git-commit-tree for
more information"""
if parent_commits is None:
try:
parent_commits = [repo.head.commit]
except ValueError:
# empty repositories have no head commit
parent_commits = []
# END handle parent commits
else:
for p in parent_commits:
if not isinstance(p, cls):
raise ValueError("Parent commit '%r' must be of type %s" % (p, cls))
# end check parent commit types
# END if parent commits are unset
# retrieve all additional information, create a commit object, and
# serialize it
# Generally:
# * Environment variables override configuration values
# * Sensible defaults are set according to the git documentation
# COMMITTER AND AUTHOR INFO
cr = repo.config_reader()
env = os.environ
committer = committer or Actor.committer(cr)
author = author or Actor.author(cr)
# PARSE THE DATES
unix_time = int(time())
is_dst = daylight and localtime().tm_isdst > 0
offset = altzone if is_dst else timezone
author_date_str = env.get(cls.env_author_date, '')
if author_date:
author_time, author_offset = parse_date(author_date)
elif author_date_str:
author_time, author_offset = parse_date(author_date_str)
else:
author_time, author_offset = unix_time, offset
# END set author time
committer_date_str = env.get(cls.env_committer_date, '')
if commit_date:
committer_time, committer_offset = parse_date(commit_date)
elif committer_date_str:
committer_time, committer_offset = parse_date(committer_date_str)
else:
committer_time, committer_offset = unix_time, offset
# END set committer time
# assume utf8 encoding
enc_section, enc_option = cls.conf_encoding.split('.')
conf_encoding = cr.get_value(enc_section, enc_option, cls.default_encoding)
# if the tree is no object, make sure we create one - otherwise
# the created commit object is invalid
if isinstance(tree, str):
tree = repo.tree(tree)
# END tree conversion
# CREATE NEW COMMIT
new_commit = cls(repo, cls.NULL_BIN_SHA, tree,
author, author_time, author_offset,
committer, committer_time, committer_offset,
message, parent_commits, conf_encoding)
stream = BytesIO()
new_commit._serialize(stream)
streamlen = stream.tell()
stream.seek(0)
istream = repo.odb.store(IStream(cls.type, streamlen, stream))
new_commit.binsha = istream.binsha
if head:
# need late import here, importing git at the very beginning throws
# as well ...
import git.refs
try:
repo.head.set_commit(new_commit, logmsg=message)
except ValueError:
# head is not yet set to the ref our HEAD points to
# Happens on first commit
master = git.refs.Head.create(repo, repo.head.ref, new_commit, logmsg="commit (initial): %s" % message)
repo.head.set_reference(master, logmsg='commit: Switching to %s' % master)
# END handle empty repositories
# END advance head handling
return new_commit
#{ Serializable Implementation
def _serialize(self, stream):
write = stream.write
write(("tree %s\n" % self.tree).encode('ascii'))
for p in self.parents:
write(("parent %s\n" % p).encode('ascii'))
a = self.author
aname = a.name
c = self.committer
fmt = "%s %s <%s> %s %s\n"
write((fmt % ("author", aname, a.email,
self.authored_date,
altz_to_utctz_str(self.author_tz_offset))).encode(self.encoding))
# encode committer
aname = c.name
write((fmt % ("committer", aname, c.email,
self.committed_date,
altz_to_utctz_str(self.committer_tz_offset))).encode(self.encoding))
if self.encoding != self.default_encoding:
write(("encoding %s\n" % self.encoding).encode('ascii'))
try:
if self.__getattribute__('gpgsig') is not None:
write(b"gpgsig")
for sigline in self.gpgsig.rstrip("\n").split("\n"):
write((" " + sigline + "\n").encode('ascii'))
except AttributeError:
pass
write(b"\n")
# write plain bytes, be sure it's encoded according to our encoding
if isinstance(self.message, str):
write(self.message.encode(self.encoding))
else:
write(self.message)
# END handle encoding
return self
def _deserialize(self, stream):
""":param from_rev_list: if true, the stream format is coming from the rev-list command
Otherwise it is assumed to be a plain data stream from our object"""
readline = stream.readline
self.tree = Tree(self.repo, hex_to_bin(readline().split()[1]), Tree.tree_id << 12, '')
self.parents = []
next_line = None
while True:
parent_line = readline()
if not parent_line.startswith(b'parent'):
next_line = parent_line
break
# END abort reading parents
self.parents.append(type(self)(self.repo, hex_to_bin(parent_line.split()[-1].decode('ascii'))))
# END for each parent line
self.parents = tuple(self.parents)
# we don't know actual author encoding before we have parsed it, so keep the lines around
author_line = next_line
committer_line = readline()
# we might run into one or more mergetag blocks, skip those for now
next_line = readline()
while next_line.startswith(b'mergetag '):
next_line = readline()
while next_line.startswith(b' '):
next_line = readline()
# end skip mergetags
# now we can have the encoding line, or an empty line followed by the optional
# message.
self.encoding = self.default_encoding
self.gpgsig = None
# read headers
enc = next_line
buf = enc.strip()
while buf:
if buf[0:10] == b"encoding ":
self.encoding = buf[buf.find(' ') + 1:].decode(
self.encoding, 'ignore')
elif buf[0:7] == b"gpgsig ":
sig = buf[buf.find(b' ') + 1:] + b"\n"
is_next_header = False
while True:
sigbuf = readline()
if not sigbuf:
break
if sigbuf[0:1] != b" ":
buf = sigbuf.strip()
is_next_header = True
break
sig += sigbuf[1:]
# end read all signature
self.gpgsig = sig.rstrip(b"\n").decode(self.encoding, 'ignore')
if is_next_header:
continue
buf = readline().strip()
# decode the authors name
try:
self.author, self.authored_date, self.author_tz_offset = \
parse_actor_and_date(author_line.decode(self.encoding, 'replace'))
except UnicodeDecodeError:
log.error("Failed to decode author line '%s' using encoding %s", author_line, self.encoding,
exc_info=True)
try:
self.committer, self.committed_date, self.committer_tz_offset = \
parse_actor_and_date(committer_line.decode(self.encoding, 'replace'))
except UnicodeDecodeError:
log.error("Failed to decode committer line '%s' using encoding %s", committer_line, self.encoding,
exc_info=True)
# END handle author's encoding
# a stream from our data simply gives us the plain message
# The end of our message stream is marked with a newline that we strip
self.message = stream.read()
try:
self.message = self.message.decode(self.encoding, 'replace')
except UnicodeDecodeError:
log.error("Failed to decode message '%s' using encoding %s", self.message, self.encoding, exc_info=True)
# END exception handling
return self
#} END serializable implementation
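A usage sketch for the Commit API above; the repository path and the 'README.md' path are illustrative assumptions.
from git import Repo, Commit

repo = Repo('./my_repo')

# iterate the last five commits reachable from HEAD
for commit in Commit.iter_items(repo, 'HEAD', max_count=5):
    print(commit.hexsha[:8], commit.summary)
    print('  author   :', commit.author, commit.authored_datetime)
    print('  committer:', commit.committer, commit.committed_datetime)

# count commits that touch a single path
print(repo.head.commit.count(paths='README.md'))

# per-file change statistics against the first parent
print(repo.head.commit.stats.files)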

View File

@ -1,203 +0,0 @@
"""Module with functions which are supposed to be as fast as possible"""
from stat import S_ISDIR
from git.compat import (
safe_decode,
defenc
)
__all__ = ('tree_to_stream', 'tree_entries_from_data', 'traverse_trees_recursive',
'traverse_tree_recursive')
def tree_to_stream(entries, write):
"""Write the give list of entries into a stream using its write method
:param entries: **sorted** list of tuples with (binsha, mode, name)
:param write: write method which takes a data string"""
ord_zero = ord('0')
bit_mask = 7 # 3 bits set
for binsha, mode, name in entries:
mode_str = b''
for i in range(6):
mode_str = bytes([((mode >> (i * 3)) & bit_mask) + ord_zero]) + mode_str
# END for each 8 octal value
# git slices away the first octal if its zero
if mode_str[0] == ord_zero:
mode_str = mode_str[1:]
# END save a byte
# here it comes: if the name is actually unicode, the replacement below
# will not work as the binsha is not part of the ascii unicode encoding -
# hence we must convert to a utf8 string for it to work properly.
# According to my tests, this is exactly what git does: it just
# takes the input literally, which appears to be utf8 on Linux.
if isinstance(name, str):
name = name.encode(defenc)
write(b''.join((mode_str, b' ', name, b'\0', binsha)))
# END for each item
def tree_entries_from_data(data):
"""Reads the binary representation of a tree and returns tuples of Tree items
:param data: data block with tree data (as bytes)
:return: list(tuple(binsha, mode, tree_relative_path), ...)"""
ord_zero = ord('0')
space_ord = ord(' ')
len_data = len(data)
i = 0
out = []
while i < len_data:
mode = 0
# read mode
# Some git versions truncate the leading 0, some don't
# The type will be extracted from the mode later
while data[i] != space_ord:
# move existing mode integer up one level being 3 bits
# and add the actual ordinal value of the character
mode = (mode << 3) + (data[i] - ord_zero)
i += 1
# END while reading mode
# byte is space now, skip it
i += 1
# parse name, it is NULL separated
ns = i
while data[i] != 0:
i += 1
# END while not reached NULL
# default encoding for strings in git is utf8
# Only use the respective unicode object if the byte stream was encoded
name = data[ns:i]
name = safe_decode(name)
# byte is NULL, get next 20
i += 1
sha = data[i:i + 20]
i = i + 20
out.append((sha, mode, name))
# END for each byte in data stream
return out
def _find_by_name(tree_data, name, is_dir, start_at):
"""return data entry matching the given name and tree mode
or None.
Before the item is returned, the respective data item is set
None in the tree_data list to mark it done"""
try:
item = tree_data[start_at]
if item and item[2] == name and S_ISDIR(item[1]) == is_dir:
tree_data[start_at] = None
return item
except IndexError:
pass
# END exception handling
for index, item in enumerate(tree_data):
if item and item[2] == name and S_ISDIR(item[1]) == is_dir:
tree_data[index] = None
return item
# END if item matches
# END for each item
return None
def _to_full_path(item, path_prefix):
"""Rebuild entry with given path prefix"""
if not item:
return item
return (item[0], item[1], path_prefix + item[2])
def traverse_trees_recursive(odb, tree_shas, path_prefix):
"""
:return: list with entries according to the given binary tree-shas.
The result is encoded in a list
of n tuple|None per blob/commit, (n == len(tree_shas)), where
* [0] == 20 byte sha
* [1] == mode as int
* [2] == path relative to working tree root
The entry tuple is None if the respective blob/commit did not
exist in the given tree.
:param tree_shas: iterable of shas pointing to trees. All trees must
be on the same level. A tree-sha may be None, in which case None is returned
for its entries in the result tuples
:param path_prefix: a prefix to be added to the returned paths on this level,
set it '' for the first iteration
:note: The ordering of the returned items will be partially lost"""
trees_data = []
nt = len(tree_shas)
for tree_sha in tree_shas:
if tree_sha is None:
data = []
else:
data = tree_entries_from_data(odb.stream(tree_sha).read())
# END handle muted trees
trees_data.append(data)
# END for each sha to get data for
out = []
out_append = out.append
# find all matching entries and recursively process them together if the match
# is a tree. If the match is a non-tree item, put it into the result.
# Processed items will be set None
for ti, tree_data in enumerate(trees_data):
for ii, item in enumerate(tree_data):
if not item:
continue
# END skip already done items
entries = [None for _ in range(nt)]
entries[ti] = item
_sha, mode, name = item
is_dir = S_ISDIR(mode) # type mode bits
# find this item in all other tree data items
# wrap around, but stop one before our current index, hence
# ti+nt, not ti+1+nt
for tio in range(ti + 1, ti + nt):
tio = tio % nt
entries[tio] = _find_by_name(trees_data[tio], name, is_dir, ii)
# END for each other item data
# if we are a directory, enter recursion
if is_dir:
out.extend(traverse_trees_recursive(
odb, [((ei and ei[0]) or None) for ei in entries], path_prefix + name + '/'))
else:
out_append(tuple(_to_full_path(e, path_prefix) for e in entries))
# END handle recursion
# finally mark it done
tree_data[ii] = None
# END for each item
# we are done with one tree, set all its data empty
del(tree_data[:])
# END for each tree_data chunk
return out
def traverse_tree_recursive(odb, tree_sha, path_prefix):
"""
:return: list of entries of the tree pointed to by the binary tree_sha. An entry
has the following format:
* [0] 20 byte sha
* [1] mode as int
* [2] path relative to the repository
:param path_prefix: prefix to prepend to the front of all returned paths"""
entries = []
data = tree_entries_from_data(odb.stream(tree_sha).read())
# unpacking/packing is faster than accessing individual items
for sha, mode, name in data:
if S_ISDIR(mode):
entries.extend(traverse_tree_recursive(odb, sha, path_prefix + name + '/'))
else:
entries.append((sha, mode, path_prefix + name))
# END for each item
return entries
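The helpers above can be driven directly from a repository's object database; a small sketch follows (the repository path is assumed).
from git import Repo
from git.objects.fun import tree_entries_from_data, traverse_tree_recursive

repo = Repo('./my_repo')
root_tree = repo.head.commit.tree

# decode the raw tree object into (binsha, mode, name) tuples
raw = repo.odb.stream(root_tree.binsha).read()
for binsha, mode, name in tree_entries_from_data(raw):
    print(oct(mode), name)

# recursively flatten the tree into (binsha, mode, repo-relative path) entries
for binsha, mode, path in traverse_tree_recursive(repo.odb, root_tree.binsha, ''):
    print(oct(mode), path)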

View File

@ -1,2 +0,0 @@
# NOTE: Cannot import anything here as the top-level _init_ has to handle
# our dependencies

View File

@ -1,350 +0,0 @@
from .base import (
Submodule,
UpdateProgress
)
from .util import (
find_first_remote_branch
)
from git.exc import InvalidGitRepositoryError
import git
import logging
__all__ = ["RootModule", "RootUpdateProgress"]
log = logging.getLogger('git.objects.submodule.root')
log.addHandler(logging.NullHandler())
class RootUpdateProgress(UpdateProgress):
"""Utility class which adds more opcodes to the UpdateProgress"""
REMOVE, PATHCHANGE, BRANCHCHANGE, URLCHANGE = [
1 << x for x in range(UpdateProgress._num_op_codes, UpdateProgress._num_op_codes + 4)]
_num_op_codes = UpdateProgress._num_op_codes + 4
__slots__ = ()
BEGIN = RootUpdateProgress.BEGIN
END = RootUpdateProgress.END
REMOVE = RootUpdateProgress.REMOVE
BRANCHCHANGE = RootUpdateProgress.BRANCHCHANGE
URLCHANGE = RootUpdateProgress.URLCHANGE
PATHCHANGE = RootUpdateProgress.PATHCHANGE
class RootModule(Submodule):
"""A (virtual) Root of all submodules in the given repository. It can be used
to more easily traverse all submodules of the master repository"""
__slots__ = ()
k_root_name = '__ROOT__'
def __init__(self, repo):
# repo, binsha, mode=None, path=None, name = None, parent_commit=None, url=None, ref=None)
super(RootModule, self).__init__(
repo,
binsha=self.NULL_BIN_SHA,
mode=self.k_default_mode,
path='',
name=self.k_root_name,
parent_commit=repo.head.commit,
url='',
branch_path=git.Head.to_full_path(self.k_head_default)
)
def _clear_cache(self):
"""May not do anything"""
pass
#{ Interface
def update(self, previous_commit=None, recursive=True, force_remove=False, init=True,
to_latest_revision=False, progress=None, dry_run=False, force_reset=False,
keep_going=False):
"""Update the submodules of this repository to the current HEAD commit.
This method behaves smartly by determining changes to the path of a submodule's
repository, as well as changes to the to-be-checked-out commit or the branch to be
checked out. This works as long as the submodule's ID does not change.
Additionally it will detect addition and removal of submodules, which will be handled
gracefully.
:param previous_commit: If set to a commit-ish, the commit we should use
as the previous commit the HEAD pointed to before it was set to the commit it points to now.
If None, it defaults to HEAD@{1}
:param recursive: if True, the children of submodules will be updated as well
using the same technique
:param force_remove: If submodules have been deleted, they will be forcibly removed.
Otherwise the update may fail if a submodule's repository cannot be deleted as
changes have been made to it (see Submodule.update() for more information)
:param init: If we encounter a new module which would need to be initialized, then do it.
:param to_latest_revision: If True, instead of checking out the revision pointed to
by this submodule's sha, the checked out tracking branch will be merged with the
latest remote branch fetched from the repository's origin.
Unless force_reset is specified, a local tracking branch will never be reset into its past, therefore
the remote branch must be in the future for this to have an effect.
:param force_reset: if True, submodules may checkout or reset their branch even if the repository has
pending changes that would be overwritten, or if the local tracking branch is in the future of the
remote tracking branch and would be reset into its past.
:param progress: RootUpdateProgress instance or None if no progress should be sent
:param dry_run: if True, operations will not actually be performed. Progress messages
will change accordingly to indicate the WOULD DO state of the operation.
:param keep_going: if True, we will ignore but log all errors, and keep going recursively.
Unless dry_run is set as well, keep_going could cause subsequent/inherited errors you wouldn't see
otherwise.
In conjunction with dry_run, it can be useful to anticipate all errors when updating submodules
:return: self"""
if self.repo.bare:
raise InvalidGitRepositoryError("Cannot update submodules in bare repositories")
# END handle bare
if progress is None:
progress = RootUpdateProgress()
# END assure progress is set
prefix = ''
if dry_run:
prefix = 'DRY-RUN: '
repo = self.repo
try:
# SETUP BASE COMMIT
###################
cur_commit = repo.head.commit
if previous_commit is None:
try:
previous_commit = repo.commit(repo.head.log_entry(-1).oldhexsha)
if previous_commit.binsha == previous_commit.NULL_BIN_SHA:
raise IndexError
# END handle initial commit
except IndexError:
# in new repositories, there is no previous commit
previous_commit = cur_commit
# END exception handling
else:
previous_commit = repo.commit(previous_commit) # obtain commit object
# END handle previous commit
psms = self.list_items(repo, parent_commit=previous_commit)
sms = self.list_items(repo)
spsms = set(psms)
ssms = set(sms)
# HANDLE REMOVALS
###################
rrsm = (spsms - ssms)
len_rrsm = len(rrsm)
for i, rsm in enumerate(rrsm):
op = REMOVE
if i == 0:
op |= BEGIN
# END handle begin
# fake it into thinking it's at the current commit to allow deletion
# of the previous module. Trigger the cache to be updated before that
progress.update(op, i, len_rrsm, prefix + "Removing submodule %r at %s" % (rsm.name, rsm.abspath))
rsm._parent_commit = repo.head.commit
rsm.remove(configuration=False, module=True, force=force_remove, dry_run=dry_run)
if i == len_rrsm - 1:
op |= END
# END handle end
progress.update(op, i, len_rrsm, prefix + "Done removing submodule %r" % rsm.name)
# END for each removed submodule
# HANDLE PATH RENAMES
#####################
# url changes + branch changes
csms = (spsms & ssms)
len_csms = len(csms)
for i, csm in enumerate(csms):
psm = psms[csm.name]
sm = sms[csm.name]
# PATH CHANGES
##############
if sm.path != psm.path and psm.module_exists():
progress.update(BEGIN | PATHCHANGE, i, len_csms, prefix +
"Moving repository of submodule %r from %s to %s"
% (sm.name, psm.abspath, sm.abspath))
# move the module to the new path
if not dry_run:
psm.move(sm.path, module=True, configuration=False)
# END handle dry_run
progress.update(
END | PATHCHANGE, i, len_csms, prefix + "Done moving repository of submodule %r" % sm.name)
# END handle path changes
if sm.module_exists():
# HANDLE URL CHANGE
###################
if sm.url != psm.url:
# Add the new remote, remove the old one
# This way, if the url just changes, the commits will not
# have to be re-retrieved
nn = '__new_origin__'
smm = sm.module()
rmts = smm.remotes
# don't do anything if the url we are looking for is already in place
if len([r for r in rmts if r.url == sm.url]) == 0:
progress.update(BEGIN | URLCHANGE, i, len_csms, prefix +
"Changing url of submodule %r from %s to %s" % (sm.name, psm.url, sm.url))
if not dry_run:
assert nn not in [r.name for r in rmts]
smr = smm.create_remote(nn, sm.url)
smr.fetch(progress=progress)
# If we have a tracking branch, it should be available
# in the new remote as well.
if len([r for r in smr.refs if r.remote_head == sm.branch_name]) == 0:
raise ValueError(
"Submodule branch named %r was not available in new submodule remote at %r"
% (sm.branch_name, sm.url)
)
# END head is not detached
# now delete the changed one
rmt_for_deletion = None
for remote in rmts:
if remote.url == psm.url:
rmt_for_deletion = remote
break
# END if urls match
# END for each remote
# if we didn't find a matching remote, but have exactly one,
# we can safely use this one
if rmt_for_deletion is None:
if len(rmts) == 1:
rmt_for_deletion = rmts[0]
else:
# if we have not found any remote with the original url
# we may not have a name. This is a special case,
# and it's okay to fail here
# Alternatively we could just generate a unique name and leave all
# existing ones in place
raise InvalidGitRepositoryError(
"Couldn't find original remote-repo at url %r" % psm.url)
# END handle one single remote
# END handle check we found a remote
orig_name = rmt_for_deletion.name
smm.delete_remote(rmt_for_deletion)
# NOTE: Currently we leave tags from the deleted remotes
# as well as separate tracking branches in the possibly totally
# changed repository (someone could have changed the url to
# another project). At some point, one might want to clean
# it up, but the danger of removing something the user
# has added explicitly is high
# rename the new remote back to what it was
smr.rename(orig_name)
# early on, we verified that our current tracking branch
# exists in the remote. Now we have to assure that the
# sha we point to is still contained in the new remote
# tracking branch.
smsha = sm.binsha
found = False
rref = smr.refs[sm.branch_name]
for c in rref.commit.traverse():
if c.binsha == smsha:
found = True
break
# END traverse all commits in search for sha
# END for each commit
if not found:
# adjust our internal binsha to use the one of the remote
# this way, it will be checked out in the next step
# This will change the submodule relative to us, so
# the user will be able to commit the change easily
log.warning("Current sha %s was not contained in the tracking\
branch at the new remote, setting it the the remote's tracking branch", sm.hexsha)
sm.binsha = rref.commit.binsha
# END reset binsha
# NOTE: All checkout is performed by the base implementation of update
# END handle dry_run
progress.update(
END | URLCHANGE, i, len_csms, prefix + "Done adjusting url of submodule %r" % (sm.name))
# END skip remote handling if new url already exists in module
# END handle url
# HANDLE PATH CHANGES
#####################
if sm.branch_path != psm.branch_path:
# finally, create a new tracking branch which tracks the
# new remote branch
progress.update(BEGIN | BRANCHCHANGE, i, len_csms, prefix +
"Changing branch of submodule %r from %s to %s"
% (sm.name, psm.branch_path, sm.branch_path))
if not dry_run:
smm = sm.module()
smmr = smm.remotes
# As the branch might not exist yet, we will have to fetch all remotes to be sure ... .
for remote in smmr:
remote.fetch(progress=progress)
# end for each remote
try:
tbr = git.Head.create(smm, sm.branch_name, logmsg='branch: Created from HEAD')
except OSError:
# ... or reuse the existing one
tbr = git.Head(smm, sm.branch_path)
# END assure tracking branch exists
tbr.set_tracking_branch(find_first_remote_branch(smmr, sm.branch_name))
# NOTE: All head-resetting is done in the base implementation of update
# but we will have to check out the new branch here. As it still points to the currently
# checked-out commit, we don't do any harm.
# As we don't want to update working-tree or index, changing the ref is all there is to do
smm.head.reference = tbr
# END handle dry_run
progress.update(
END | BRANCHCHANGE, i, len_csms, prefix + "Done changing branch of submodule %r" % sm.name)
# END handle branch
# END handle
# END for each common submodule
except Exception as err:
if not keep_going:
raise
log.error(str(err))
# end handle keep_going
# FINALLY UPDATE ALL ACTUAL SUBMODULES
######################################
for sm in sms:
# update the submodule using the default method
sm.update(recursive=False, init=init, to_latest_revision=to_latest_revision,
progress=progress, dry_run=dry_run, force=force_reset, keep_going=keep_going)
# update recursively depth first - question is which inconsistent
# state will be better in case it fails somewhere. Defective branch
# or defective depth. The RootSubmodule type will never process itself,
# which was done in the previous expression
if recursive:
# the module would exist by now if we are not in dry_run mode
if sm.module_exists():
type(self)(sm.module()).update(recursive=True, force_remove=force_remove,
init=init, to_latest_revision=to_latest_revision,
progress=progress, dry_run=dry_run, force_reset=force_reset,
keep_going=keep_going)
# END handle dry_run
# END handle recursive
# END for each submodule to update
return self
def module(self):
""":return: the actual repository containing the submodules"""
return self.repo
#} END interface
#} END classes
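A minimal sketch of driving RootModule.update with a progress handler; the repository path is assumed to point at a non-bare clone with at least one commit, and dry_run keeps the sketch side-effect free.
from git import Repo
from git.objects.submodule.root import RootModule, RootUpdateProgress

class PrintProgress(RootUpdateProgress):
    def update(self, op_code, cur_count, max_count=None, message=''):
        if message:
            print(message)

repo = Repo('./my_repo')
RootModule(repo).update(recursive=True, init=True, to_latest_revision=False,
                        progress=PrintProgress(), dry_run=True)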

View File

@ -1,94 +0,0 @@
import git
from git.exc import InvalidGitRepositoryError
from git.config import GitConfigParser
from io import BytesIO
import weakref
__all__ = ('sm_section', 'sm_name', 'mkhead', 'find_first_remote_branch',
'SubmoduleConfigParser')
#{ Utilities
def sm_section(name):
""":return: section title used in .gitmodules configuration file"""
return 'submodule "%s"' % name
def sm_name(section):
""":return: name of the submodule as parsed from the section name"""
section = section.strip()
return section[11:-1]
def mkhead(repo, path):
""":return: New branch/head instance"""
return git.Head(repo, git.Head.to_full_path(path))
def find_first_remote_branch(remotes, branch_name):
"""Find the remote branch matching the name of the given branch or raise InvalidGitRepositoryError"""
for remote in remotes:
try:
return remote.refs[branch_name]
except IndexError:
continue
# END exception handling
# END for remote
raise InvalidGitRepositoryError("Didn't find remote branch '%r' in any of the given remotes" % branch_name)
#} END utilities
#{ Classes
class SubmoduleConfigParser(GitConfigParser):
"""
Catches calls to _write, and updates the .gitmodules blob in the index
with the new data, if we have written into a stream. Otherwise it will
add the local file to the index to make it correspond with the working tree.
Additionally, the cache must be cleared
Please note that no mutating method will work in bare mode
"""
def __init__(self, *args, **kwargs):
self._smref = None
self._index = None
self._auto_write = True
super(SubmoduleConfigParser, self).__init__(*args, **kwargs)
#{ Interface
def set_submodule(self, submodule):
"""Set this instance's submodule. It must be called before
the first write operation begins"""
self._smref = weakref.ref(submodule)
def flush_to_index(self):
"""Flush changes in our configuration file to the index"""
assert self._smref is not None
# should always have a file here
assert not isinstance(self._file_or_files, BytesIO)
sm = self._smref()
if sm is not None:
index = self._index
if index is None:
index = sm.repo.index
# END handle index
index.add([sm.k_modules_file], write=self._auto_write)
sm._clear_cache()
# END handle weakref
#} END interface
#{ Overridden Methods
def write(self):
rval = super(SubmoduleConfigParser, self).write()
self.flush_to_index()
return rval
# END overridden methods
#} END classes
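A short sketch of the helpers above; the repository path and branch name are illustrative.
from git import Repo
from git.objects.submodule.util import sm_section, sm_name, mkhead

print(sm_section('gitdb'))            # -> 'submodule "gitdb"'
print(sm_name('submodule "gitdb"'))   # -> 'gitdb'

repo = Repo('./my_repo')
head = mkhead(repo, 'feature/x')      # Head object for refs/heads/feature/x
print(head.path)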

View File

@ -1,75 +0,0 @@
# objects.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
""" Module containing all object based types. """
from . import base
from .util import get_object_type_by_name, parse_actor_and_date
from ..util import hex_to_bin
from ..compat import defenc
__all__ = ("TagObject", )
class TagObject(base.Object):
"""Non-Lightweight tag carrying additional information about an object we are pointing to."""
type = "tag"
__slots__ = ("object", "tag", "tagger", "tagged_date", "tagger_tz_offset", "message")
def __init__(self, repo, binsha, object=None, tag=None, # @ReservedAssignment
tagger=None, tagged_date=None, tagger_tz_offset=None, message=None):
"""Initialize a tag object with additional data
:param repo: repository this object is located in
:param binsha: 20 byte SHA1
:param object: Object instance of object we are pointing to
:param tag: name of this tag
:param tagger: Actor identifying the tagger
:param tagged_date: int_seconds_since_epoch
is the DateTime of the tag creation - use time.gmtime to convert
it into a different format
:param tagger_tz_offset: int_seconds_west_of_utc is the timezone that the
tagged_date is in, in a format similar to time.altzone"""
super(TagObject, self).__init__(repo, binsha)
if object is not None:
self.object = object
if tag is not None:
self.tag = tag
if tagger is not None:
self.tagger = tagger
if tagged_date is not None:
self.tagged_date = tagged_date
if tagger_tz_offset is not None:
self.tagger_tz_offset = tagger_tz_offset
if message is not None:
self.message = message
def _set_cache_(self, attr):
"""Cache all our attributes at once"""
if attr in TagObject.__slots__:
ostream = self.repo.odb.stream(self.binsha)
lines = ostream.read().decode(defenc, 'replace').splitlines()
_obj, hexsha = lines[0].split(" ")
_type_token, type_name = lines[1].split(" ")
self.object = \
get_object_type_by_name(type_name.encode('ascii'))(self.repo, hex_to_bin(hexsha))
self.tag = lines[2][4:] # tag <tag name>
if len(lines) > 3:
tagger_info = lines[3] # tagger <actor> <date>
self.tagger, self.tagged_date, self.tagger_tz_offset = parse_actor_and_date(tagger_info)
# line 4 empty - it could mark the beginning of the next header
# in case there really is no message, it would not exist. Otherwise
# a newline separates header from message
if len(lines) > 5:
self.message = "\n".join(lines[5:])
else:
self.message = ''
# END check our attributes
else:
super(TagObject, self)._set_cache_(attr)
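Reading an annotated tag through TagObject, as a sketch; the tag name 'v1.0.0' is assumed to exist as an annotated tag in the repository.
from git import Repo

repo = Repo('./my_repo')
tag_ref = repo.tags['v1.0.0']
tag_obj = tag_ref.tag                  # TagObject, or None for lightweight tags
if tag_obj is not None:
    print(tag_obj.tag)                 # tag name
    print(tag_obj.tagger, tag_obj.tagged_date, tag_obj.tagger_tz_offset)
    print(tag_obj.message)
    print(tag_obj.object.hexsha)       # the object the tag points to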

View File

@ -1,337 +0,0 @@
# tree.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
from git.util import join_path
import git.diff as diff
from git.util import to_bin_sha
from . import util
from .base import IndexObject
from .blob import Blob
from .submodule.base import Submodule
from .fun import (
tree_entries_from_data,
tree_to_stream
)
cmp = lambda a, b: (a > b) - (a < b)
__all__ = ("TreeModifier", "Tree")
def git_cmp(t1, t2):
a, b = t1[2], t2[2]
len_a, len_b = len(a), len(b)
min_len = min(len_a, len_b)
min_cmp = cmp(a[:min_len], b[:min_len])
if min_cmp:
return min_cmp
return len_a - len_b
def merge_sort(a, cmp):
if len(a) < 2:
return
mid = len(a) // 2
lefthalf = a[:mid]
righthalf = a[mid:]
merge_sort(lefthalf, cmp)
merge_sort(righthalf, cmp)
i = 0
j = 0
k = 0
while i < len(lefthalf) and j < len(righthalf):
if cmp(lefthalf[i], righthalf[j]) <= 0:
a[k] = lefthalf[i]
i = i + 1
else:
a[k] = righthalf[j]
j = j + 1
k = k + 1
while i < len(lefthalf):
a[k] = lefthalf[i]
i = i + 1
k = k + 1
while j < len(righthalf):
a[k] = righthalf[j]
j = j + 1
k = k + 1
class TreeModifier(object):
"""A utility class providing methods to alter the underlying cache in a list-like fashion.
Once all adjustments are complete, the _cache, which really is a reference to
the cache of a tree, will be sorted, assuring it will be in a serializable state"""
__slots__ = '_cache'
def __init__(self, cache):
self._cache = cache
def _index_by_name(self, name):
""":return: index of an item with name, or -1 if not found"""
for i, t in enumerate(self._cache):
if t[2] == name:
return i
# END found item
# END for each item in cache
return -1
#{ Interface
def set_done(self):
"""Call this method once you are done modifying the tree information.
It may be called several times, but be aware that each call will cause
a sort operation
:return: self"""
merge_sort(self._cache, git_cmp)
return self
#} END interface
#{ Mutators
def add(self, sha, mode, name, force=False):
"""Add the given item to the tree. If an item with the given name already
exists, nothing will be done, but a ValueError will be raised if the
sha and mode of the existing item do not match the one you add, unless
force is True
:param sha: The 20 or 40 byte sha of the item to add
:param mode: int representing the stat compatible mode of the item
:param force: If True, an item with your name and information will overwrite
any existing item with the same name, no matter which information it has
:return: self"""
if '/' in name:
raise ValueError("Name must not contain '/' characters")
if (mode >> 12) not in Tree._map_id_to_type:
raise ValueError("Invalid object type according to mode %o" % mode)
sha = to_bin_sha(sha)
index = self._index_by_name(name)
item = (sha, mode, name)
if index == -1:
self._cache.append(item)
else:
if force:
self._cache[index] = item
else:
ex_item = self._cache[index]
if ex_item[0] != sha or ex_item[1] != mode:
raise ValueError("Item %r existed with different properties" % name)
# END handle mismatch
# END handle force
# END handle name exists
return self
def add_unchecked(self, binsha, mode, name):
"""Add the given item to the tree, its correctness is assumed, which
puts the caller into responsibility to assure the input is correct.
For more information on the parameters, see ``add``
:param binsha: 20 byte binary sha"""
self._cache.append((binsha, mode, name))
def __delitem__(self, name):
"""Deletes an item with the given name if it exists"""
index = self._index_by_name(name)
if index > -1:
del(self._cache[index])
#} END mutators
class Tree(IndexObject, diff.Diffable, util.Traversable, util.Serializable):
"""Tree objects represent an ordered list of Blobs and other Trees.
``Tree as a list``::
Access a specific blob using the
tree['filename'] notation.
You may as well access by index
blob = tree[0]
"""
type = "tree"
__slots__ = "_cache"
# actual integer ids for comparison
commit_id = 0o16 # equals stat.S_IFDIR | stat.S_IFLNK - a directory link
blob_id = 0o10
symlink_id = 0o12
tree_id = 0o04
_map_id_to_type = {
commit_id: Submodule,
blob_id: Blob,
symlink_id: Blob
# tree id added once Tree is defined
}
def __init__(self, repo, binsha, mode=tree_id << 12, path=None):
super(Tree, self).__init__(repo, binsha, mode, path)
@classmethod
def _get_intermediate_items(cls, index_object):
if index_object.type == "tree":
return tuple(index_object._iter_convert_to_object(index_object._cache))
return ()
def _set_cache_(self, attr):
if attr == "_cache":
# Set the data when we need it
ostream = self.repo.odb.stream(self.binsha)
self._cache = tree_entries_from_data(ostream.read())
else:
super(Tree, self)._set_cache_(attr)
# END handle attribute
def _iter_convert_to_object(self, iterable):
"""Iterable yields tuples of (binsha, mode, name), which will be converted
to the respective object representation"""
for binsha, mode, name in iterable:
path = join_path(self.path, name)
try:
yield self._map_id_to_type[mode >> 12](self.repo, binsha, mode, path)
except KeyError as e:
raise TypeError("Unknown mode %o found in tree data for path '%s'" % (mode, path)) from e
# END for each item
def join(self, file):
"""Find the named object in this tree's contents
:return: ``git.Blob`` or ``git.Tree`` or ``git.Submodule``
:raise KeyError: if given file or tree does not exist in tree"""
msg = "Blob or Tree named %r not found"
if '/' in file:
tree = self
item = self
tokens = file.split('/')
for i, token in enumerate(tokens):
item = tree[token]
if item.type == 'tree':
tree = item
else:
# safety assertion - blobs are at the end of the path
if i != len(tokens) - 1:
raise KeyError(msg % file)
return item
# END handle item type
# END for each token of split path
if item == self:
raise KeyError(msg % file)
return item
else:
for info in self._cache:
if info[2] == file: # [2] == name
return self._map_id_to_type[info[1] >> 12](self.repo, info[0], info[1],
join_path(self.path, info[2]))
# END for each obj
raise KeyError(msg % file)
# END handle long paths
def __div__(self, file):
"""For PY2 only"""
return self.join(file)
def __truediv__(self, file):
"""For PY3 only"""
return self.join(file)
@property
def trees(self):
""":return: list(Tree, ...) list of trees directly below this tree"""
return [i for i in self if i.type == "tree"]
@property
def blobs(self):
""":return: list(Blob, ...) list of blobs directly below this tree"""
return [i for i in self if i.type == "blob"]
@property
def cache(self):
"""
:return: An object allowing to modify the internal cache. This can be used
to change the tree's contents. When done, make sure you call ``set_done``
on the tree modifier, or serialization behaviour will be incorrect.
See the ``TreeModifier`` for more information on how to alter the cache"""
return TreeModifier(self._cache)
def traverse(self, predicate=lambda i, d: True,
prune=lambda i, d: False, depth=-1, branch_first=True,
visit_once=False, ignore_self=1):
"""For documentation, see util.Traversable.traverse
Trees are set to visit_once = False to gain more performance in the traversal"""
return super(Tree, self).traverse(predicate, prune, depth, branch_first, visit_once, ignore_self)
# List protocol
def __getslice__(self, i, j):
return list(self._iter_convert_to_object(self._cache[i:j]))
def __iter__(self):
return self._iter_convert_to_object(self._cache)
def __len__(self):
return len(self._cache)
def __getitem__(self, item):
if isinstance(item, int):
info = self._cache[item]
return self._map_id_to_type[info[1] >> 12](self.repo, info[0], info[1], join_path(self.path, info[2]))
if isinstance(item, str):
# compatibility
return self.join(item)
# END index is basestring
raise TypeError("Invalid index type: %r" % item)
def __contains__(self, item):
if isinstance(item, IndexObject):
for info in self._cache:
if item.binsha == info[0]:
return True
# END compare sha
# END for each entry
# END handle item is index object
# compatibility
# treat item as repo-relative path
path = self.path
for info in self._cache:
if item == join_path(path, info[2]):
return True
# END for each item
return False
def __reversed__(self):
return reversed(self._iter_convert_to_object(self._cache))
def _serialize(self, stream):
"""Serialize this tree into the stream. Please note that we will assume
our tree data to be in a sorted state. If this is not the case, serialization
will not generate a correct tree representation as these are assumed to be sorted
by algorithms"""
tree_to_stream(self._cache, stream.write)
return self
def _deserialize(self, stream):
self._cache = tree_entries_from_data(stream.read())
return self
# END tree
# finalize map definition
Tree._map_id_to_type[Tree.tree_id] = Tree
#
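A usage sketch of the Tree protocol above; the repository path and the 'README.md' entry are assumptions.
from git import Repo

repo = Repo('./my_repo')
tree = repo.head.commit.tree

print(len(tree))                     # number of direct entries
print(tree[0].name)                  # list-style index access
readme = tree / 'README.md'          # same as tree.join('README.md')
print(readme.hexsha, oct(readme.mode))

for item in tree.traverse():         # recursive walk over blobs, trees and submodules
    print(item.type, item.path)

print([t.path for t in tree.trees])  # direct subtrees
print([b.path for b in tree.blobs])  # direct blobs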

View File

@ -1,373 +0,0 @@
# util.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module for general utility functions"""
from git.util import (
IterableList,
Actor
)
import re
from collections import deque as Deque
from string import digits
import time
import calendar
from datetime import datetime, timedelta, tzinfo
__all__ = ('get_object_type_by_name', 'parse_date', 'parse_actor_and_date',
'ProcessStreamAdapter', 'Traversable', 'altz_to_utctz_str', 'utctz_to_altz',
'verify_utctz', 'Actor', 'tzoffset', 'utc')
ZERO = timedelta(0)
#{ Functions
def mode_str_to_int(modestr):
"""
:param modestr: string like 755 or 644 or 100644 - only the last 6 chars will be used
:return:
Integer mode compatible with the mode bit constants of the
stat module regarding the rwx permissions for user, group and other,
special flags and file system flags, i.e. whether it is a symlink
for example."""
mode = 0
for iteration, char in enumerate(reversed(modestr[-6:])):
mode += int(char) << iteration * 3
# END for each char
return mode
def get_object_type_by_name(object_type_name):
"""
:return: type suitable to handle the given object type name.
Use the type to create new instances.
:param object_type_name: Member of TYPES
:raise ValueError: In case object_type_name is unknown"""
if object_type_name == b"commit":
from . import commit
return commit.Commit
elif object_type_name == b"tag":
from . import tag
return tag.TagObject
elif object_type_name == b"blob":
from . import blob
return blob.Blob
elif object_type_name == b"tree":
from . import tree
return tree.Tree
else:
raise ValueError("Cannot handle unknown object type: %s" % object_type_name)
def utctz_to_altz(utctz):
"""we convert utctz to the timezone in seconds, it is the format time.altzone
returns. Git stores it as UTC timezone which has the opposite sign as well,
which explains the -1 * ( that was made explicit here )
:param utctz: git utc timezone string, i.e. +0200"""
return -1 * int(float(utctz) / 100 * 3600)
def altz_to_utctz_str(altz):
"""As above, but inverses the operation, returning a string that can be used
in commit objects"""
utci = -1 * int((float(altz) / 3600) * 100)
utcs = str(abs(utci))
utcs = "0" * (4 - len(utcs)) + utcs
prefix = (utci < 0 and '-') or '+'
return prefix + utcs
def verify_utctz(offset):
""":raise ValueError: if offset is incorrect
:return: offset"""
fmt_exc = ValueError("Invalid timezone offset format: %s" % offset)
if len(offset) != 5:
raise fmt_exc
if offset[0] not in "+-":
raise fmt_exc
if offset[1] not in digits or\
offset[2] not in digits or\
offset[3] not in digits or\
offset[4] not in digits:
raise fmt_exc
# END for each char
return offset
class tzoffset(tzinfo):
def __init__(self, secs_west_of_utc, name=None):
self._offset = timedelta(seconds=-secs_west_of_utc)
self._name = name or 'fixed'
def __reduce__(self):
return tzoffset, (-self._offset.total_seconds(), self._name)
def utcoffset(self, dt):
return self._offset
def tzname(self, dt):
return self._name
def dst(self, dt):
return ZERO
utc = tzoffset(0, 'UTC')
def from_timestamp(timestamp, tz_offset):
"""Converts a timestamp + tz_offset into an aware datetime instance."""
utc_dt = datetime.fromtimestamp(timestamp, utc)
try:
local_dt = utc_dt.astimezone(tzoffset(tz_offset))
return local_dt
except ValueError:
return utc_dt
def parse_date(string_date):
"""
Parse the given date as one of the following
* aware datetime instance
* Git internal format: timestamp offset
* RFC 2822: Thu, 07 Apr 2005 22:13:13 +0200.
* ISO 8601 2005-04-07T22:13:13
The T can be a space as well
:return: Tuple(int(timestamp_UTC), int(offset)), both in seconds since epoch
:raise ValueError: If the format could not be understood
:note: Date can also be YYYY.MM.DD, MM/DD/YYYY and DD.MM.YYYY.
"""
if isinstance(string_date, datetime) and string_date.tzinfo:
offset = -int(string_date.utcoffset().total_seconds())
return int(string_date.astimezone(utc).timestamp()), offset
# git time
try:
if string_date.count(' ') == 1 and string_date.rfind(':') == -1:
timestamp, offset = string_date.split()
if timestamp.startswith('@'):
timestamp = timestamp[1:]
timestamp = int(timestamp)
return timestamp, utctz_to_altz(verify_utctz(offset))
else:
offset = "+0000" # local time by default
if string_date[-5] in '-+':
offset = verify_utctz(string_date[-5:])
string_date = string_date[:-6] # skip space as well
# END split timezone info
offset = utctz_to_altz(offset)
# now figure out the date and time portion - split time
date_formats = []
splitter = -1
if ',' in string_date:
date_formats.append("%a, %d %b %Y")
splitter = string_date.rfind(' ')
else:
# iso plus additional
date_formats.append("%Y-%m-%d")
date_formats.append("%Y.%m.%d")
date_formats.append("%m/%d/%Y")
date_formats.append("%d.%m.%Y")
splitter = string_date.rfind('T')
if splitter == -1:
splitter = string_date.rfind(' ')
# END handle 'T' and ' '
# END handle rfc or iso
assert splitter > -1
# split date and time
time_part = string_date[splitter + 1:] # skip space
date_part = string_date[:splitter]
# parse time
tstruct = time.strptime(time_part, "%H:%M:%S")
for fmt in date_formats:
try:
dtstruct = time.strptime(date_part, fmt)
utctime = calendar.timegm((dtstruct.tm_year, dtstruct.tm_mon, dtstruct.tm_mday,
tstruct.tm_hour, tstruct.tm_min, tstruct.tm_sec,
dtstruct.tm_wday, dtstruct.tm_yday, tstruct.tm_isdst))
return int(utctime), offset
except ValueError:
continue
# END exception handling
# END for each fmt
# still here ? fail
raise ValueError("no format matched")
# END handle format
except Exception as e:
raise ValueError("Unsupported date format: %s" % string_date) from e
# END handle exceptions
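# Editor's sketch, not part of the original module: parse_date with git's
# internal "<timestamp> <offset>" format and with an aware datetime; offsets
# come back in time.altzone-style seconds.
#
#   >>> parse_date("1191999972 -0700")
#   (1191999972, 25200)
#   >>> parse_date("@1400000000 +0000")
#   (1400000000, 0)
#   >>> parse_date(datetime(2020, 1, 1, tzinfo=utc))
#   (1577836800, 0)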
# precompiled regex
_re_actor_epoch = re.compile(r'^.+? (.*) (\d+) ([+-]\d+).*$')
_re_only_actor = re.compile(r'^.+? (.*)$')
def parse_actor_and_date(line):
"""Parse out the actor (author or committer) info from a line like::
author Tom Preston-Werner <tom@mojombo.com> 1191999972 -0700
:return: [Actor, int_seconds_since_epoch, int_timezone_offset]"""
actor, epoch, offset = '', 0, 0
m = _re_actor_epoch.search(line)
if m:
actor, epoch, offset = m.groups()
else:
m = _re_only_actor.search(line)
actor = m.group(1) if m else line or ''
return (Actor._from_string(actor), int(epoch), utctz_to_altz(offset))
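# Editor's sketch, not part of the original module, using the author line
# quoted in the docstring above.
#
#   >>> actor, epoch, offset = parse_actor_and_date(
#   ...     "author Tom Preston-Werner <tom@mojombo.com> 1191999972 -0700")
#   >>> actor.name, actor.email
#   ('Tom Preston-Werner', 'tom@mojombo.com')
#   >>> epoch, offset
#   (1191999972, 25200)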
#} END functions
#{ Classes
class ProcessStreamAdapter(object):
"""Class wireing all calls to the contained Process instance.
Use this type to hide the underlying process to provide access only to a specified
stream. The process is usually wrapped into an AutoInterrupt class to kill
it if the instance goes out of scope."""
__slots__ = ("_proc", "_stream")
def __init__(self, process, stream_name):
self._proc = process
self._stream = getattr(process, stream_name)
def __getattr__(self, attr):
return getattr(self._stream, attr)
class Traversable(object):
"""Simple interface to perform depth-first or breadth-first traversals
in one direction.
Subclasses only need to implement one function.
Instances of the Subclass must be hashable"""
__slots__ = ()
@classmethod
def _get_intermediate_items(cls, item):
"""
Returns:
List of items connected to the given item.
Must be implemented in subclass
"""
raise NotImplementedError("To be implemented in subclass")
def list_traverse(self, *args, **kwargs):
"""
:return: IterableList with the results of the traversal as produced by
traverse()"""
out = IterableList(self._id_attribute_)
out.extend(self.traverse(*args, **kwargs))
return out
def traverse(self, predicate=lambda i, d: True,
prune=lambda i, d: False, depth=-1, branch_first=True,
visit_once=True, ignore_self=1, as_edge=False):
""":return: iterator yielding of items found when traversing self
:param predicate: f(i,d) returns False if item i at depth d should not be included in the result
:param prune:
f(i,d) return True if the search should stop at item i at depth d.
Item i will not be returned.
:param depth:
define at which level the iteration should not go deeper
if -1, there is no limit
if 0, you would effectively only get self, the root of the iteration
i.e. if 1, you would only get the first level of predecessors/successors
:param branch_first:
if True, items will be returned branch first, otherwise depth first
:param visit_once:
if True, items will only be returned once, although they might be encountered
several times. Loops are prevented that way.
:param ignore_self:
if True, self will be ignored and automatically pruned from
the result. Otherwise it will be the first item to be returned.
If as_edge is True, the source of the first edge is None
:param as_edge:
if True, return a pair of items, first being the source, second the
destination, i.e. tuple(src, dest) with the edge spanning from
source to destination"""
visited = set()
stack = Deque()
stack.append((0, self, None)) # self is always depth level 0
def addToStack(stack, item, branch_first, depth):
lst = self._get_intermediate_items(item)
if not lst:
return
if branch_first:
stack.extendleft((depth, i, item) for i in lst)
else:
reviter = ((depth, lst[i], item) for i in range(len(lst) - 1, -1, -1))
stack.extend(reviter)
# END addToStack local method
while stack:
d, item, src = stack.pop() # depth of item, item, item_source
if visit_once and item in visited:
continue
if visit_once:
visited.add(item)
rval = (as_edge and (src, item)) or item
if prune(rval, d):
continue
skipStartItem = ignore_self and (item is self)
if not skipStartItem and predicate(rval, d):
yield rval
# only continue to next level if this is appropriate !
nd = d + 1
if depth > -1 and nd > depth:
continue
addToStack(stack, item, branch_first, nd)
# END for each item on work stack
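# Editor's sketch, not part of the original class: Commit is one Traversable
# subclass, its intermediate items being the parent commits; `repo` stands for
# a hypothetical git.Repo instance opened elsewhere.
#
#   >>> ancestors = repo.head.commit.list_traverse(depth=2, branch_first=True)
#   >>> all(parent.type == 'commit' for parent in ancestors)
#   True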
class Serializable(object):
"""Defines methods to serialize and deserialize objects from and into a data stream"""
__slots__ = ()
def _serialize(self, stream):
"""Serialize the data of this object into the given data stream
:note: a serialized object would ``_deserialize`` into the same object
:param stream: a file-like object
:return: self"""
raise NotImplementedError("To be implemented in subclass")
def _deserialize(self, stream):
"""Deserialize all information regarding this object from the stream
:param stream: a file-like object
:return: self"""
raise NotImplementedError("To be implemented in subclass")

View File

@ -1,10 +0,0 @@
# flake8: noqa
from __future__ import absolute_import
# import all modules in order, fix the names they require
from .symbolic import *
from .reference import *
from .head import *
from .tag import *
from .remote import *
from .log import *

View File

@ -1,246 +0,0 @@
from git.config import SectionConstraint
from git.util import join_path
from git.exc import GitCommandError
from .symbolic import SymbolicReference
from .reference import Reference
__all__ = ["HEAD", "Head"]
def strip_quotes(string):
if string.startswith('"') and string.endswith('"'):
return string[1:-1]
return string
class HEAD(SymbolicReference):
"""Special case of a Symbolic Reference as it represents the repository's
HEAD reference."""
_HEAD_NAME = 'HEAD'
_ORIG_HEAD_NAME = 'ORIG_HEAD'
__slots__ = ()
def __init__(self, repo, path=_HEAD_NAME):
if path != self._HEAD_NAME:
raise ValueError("HEAD instance must point to %r, got %r" % (self._HEAD_NAME, path))
super(HEAD, self).__init__(repo, path)
def orig_head(self):
"""
:return: SymbolicReference pointing at the ORIG_HEAD, which is maintained
to contain the previous value of HEAD"""
return SymbolicReference(self.repo, self._ORIG_HEAD_NAME)
def reset(self, commit='HEAD', index=True, working_tree=False,
paths=None, **kwargs):
"""Reset our HEAD to the given commit optionally synchronizing
the index and working tree. The reference we refer to will be set to
commit as well.
:param commit:
Commit object, Reference Object or string identifying a revision we
should reset HEAD to.
:param index:
If True, the index will be set to match the given commit. Otherwise
it will not be touched.
:param working_tree:
If True, the working tree will be forcefully adjusted to match the given
commit, possibly overwriting uncommitted changes without warning.
If working_tree is True, index must be true as well
:param paths:
Single path or list of paths relative to the git root directory
that are to be reset. This allows partially resetting individual files.
:param kwargs:
Additional arguments passed to git-reset.
:return: self"""
mode = "--soft"
if index:
mode = "--mixed"
# it appears, some git-versions declare mixed and paths deprecated
# see http://github.com/Byron/GitPython/issues#issue/2
if paths:
mode = None
# END special case
# END handle index
if working_tree:
mode = "--hard"
if not index:
raise ValueError("Cannot reset the working tree if the index is not reset as well")
# END working tree handling
try:
self.repo.git.reset(mode, commit, '--', paths, **kwargs)
except GitCommandError as e:
# git nowadays may use 1 as status to indicate there are still unstaged
# modifications after the reset
if e.status != 1:
raise
# END handle exception
return self
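# Editor's sketch, not part of the original class; `repo` is a hypothetical
# git.Repo. The parameter combinations map onto git's --soft/--mixed/--hard
# modes as implemented above.
#
#   >>> _ = repo.head.reset('HEAD~1', index=False)                   # --soft
#   >>> _ = repo.head.reset('HEAD~1')                                # --mixed
#   >>> _ = repo.head.reset('HEAD', index=True, working_tree=True)   # --hard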
class Head(Reference):
"""A Head is a named reference to a Commit. Every Head instance contains a name
and a Commit object.
Examples::
>>> repo = Repo("/path/to/repo")
>>> head = repo.heads[0]
>>> head.name
'master'
>>> head.commit
<git.Commit "1c09f116cbc2cb4100fb6935bb162daa4723f455">
>>> head.commit.hexsha
'1c09f116cbc2cb4100fb6935bb162daa4723f455'"""
_common_path_default = "refs/heads"
k_config_remote = "remote"
k_config_remote_ref = "merge" # branch to merge from remote
@classmethod
def delete(cls, repo, *heads, **kwargs):
"""Delete the given heads
:param force:
If True, the heads will be deleted even if they are not yet merged into
the main development stream.
Default False"""
force = kwargs.get("force", False)
flag = "-d"
if force:
flag = "-D"
repo.git.branch(flag, *heads)
def set_tracking_branch(self, remote_reference):
"""
Configure this branch to track the given remote reference. This will alter
this branch's configuration accordingly.
:param remote_reference: The remote reference to track or None to untrack
any references
:return: self"""
from .remote import RemoteReference
if remote_reference is not None and not isinstance(remote_reference, RemoteReference):
raise ValueError("Incorrect parameter type: %r" % remote_reference)
# END handle type
with self.config_writer() as writer:
if remote_reference is None:
writer.remove_option(self.k_config_remote)
writer.remove_option(self.k_config_remote_ref)
if len(writer.options()) == 0:
writer.remove_section()
else:
writer.set_value(self.k_config_remote, remote_reference.remote_name)
writer.set_value(self.k_config_remote_ref, Head.to_full_path(remote_reference.remote_head))
return self
def tracking_branch(self):
"""
:return: The remote_reference we are tracking, or None if we are
not a tracking branch"""
from .remote import RemoteReference
reader = self.config_reader()
if reader.has_option(self.k_config_remote) and reader.has_option(self.k_config_remote_ref):
ref = Head(self.repo, Head.to_full_path(strip_quotes(reader.get_value(self.k_config_remote_ref))))
remote_refpath = RemoteReference.to_full_path(join_path(reader.get_value(self.k_config_remote), ref.name))
return RemoteReference(self.repo, remote_refpath)
# END handle have tracking branch
# we are not a tracking branch
return None
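# Editor's sketch, not part of the original class; `repo` is a hypothetical
# git.Repo with an 'origin' remote.
#
#   >>> master = repo.heads.master
#   >>> _ = master.set_tracking_branch(repo.remotes.origin.refs.master)
#   >>> master.tracking_branch().remote_name
#   'origin'
#   >>> _ = master.set_tracking_branch(None)   # untrack again
#   >>> master.tracking_branch() is None
#   True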
def rename(self, new_path, force=False):
"""Rename self to a new path
:param new_path:
Either a simple name or a path, i.e. new_name or features/new_name.
The prefix refs/heads is implied
:param force:
If True, the rename will succeed even if a head with the target name
already exists.
:return: self
:note: respects the ref log as git commands are used"""
flag = "-m"
if force:
flag = "-M"
self.repo.git.branch(flag, self, new_path)
self.path = "%s/%s" % (self._common_path_default, new_path)
return self
def checkout(self, force=False, **kwargs):
"""Checkout this head by setting the HEAD to this reference, by updating the index
to reflect the tree we point to and by updating the working tree to reflect
the latest index.
The command will fail if changed working tree files would be overwritten.
:param force:
If True, changes to the index and the working tree will be discarded.
If False, GitCommandError will be raised in that situation.
:param kwargs:
Additional keyword arguments to be passed to git checkout, i.e.
b='new_branch' to create a new branch at the given spot.
:return:
The active branch after the checkout operation, usually self unless
a new branch has been created.
If there is no active branch, as the HEAD is now detached, the HEAD
reference will be returned instead.
:note:
By default it is only allowed to checkout heads - everything else
will leave the HEAD detached which is allowed and possible, but remains
a special state that some tools might not be able to handle."""
kwargs['f'] = force
if kwargs['f'] is False:
kwargs.pop('f')
self.repo.git.checkout(self, **kwargs)
if self.repo.head.is_detached:
return self.repo.head
return self.repo.active_branch
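# Editor's sketch, not part of the original class; `repo` is a hypothetical
# git.Repo. Passing b='...' forwards -b to git-checkout and creates the branch.
#
#   >>> branch = repo.heads.master.checkout(b='feature/demo')
#   >>> branch.name
#   'feature/demo'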
#{ Configuration
def _config_parser(self, read_only):
if read_only:
parser = self.repo.config_reader()
else:
parser = self.repo.config_writer()
# END handle parser instance
return SectionConstraint(parser, 'branch "%s"' % self.name)
def config_reader(self):
"""
:return: A configuration parser instance constrained to only read
this instance's values"""
return self._config_parser(read_only=True)
def config_writer(self):
"""
:return: A configuration writer instance with read and write access
to options of this head"""
return self._config_parser(read_only=False)
#} END configuration

View File

@ -1,305 +0,0 @@
import re
import time
from git.compat import defenc
from git.objects.util import (
parse_date,
Serializable,
altz_to_utctz_str,
)
from git.util import (
Actor,
LockedFD,
LockFile,
assure_directory_exists,
to_native_path,
bin_to_hex,
file_contents_ro_filepath
)
import os.path as osp
__all__ = ["RefLog", "RefLogEntry"]
class RefLogEntry(tuple):
"""Named tuple allowing easy access to the revlog data fields"""
_re_hexsha_only = re.compile('^[0-9A-Fa-f]{40}$')
__slots__ = ()
def __repr__(self):
"""Representation of ourselves in git reflog format"""
return self.format()
def format(self):
""":return: a string suitable to be placed in a reflog file"""
act = self.actor
time = self.time
return "{} {} {} <{}> {!s} {}\t{}\n".format(self.oldhexsha,
self.newhexsha,
act.name,
act.email,
time[0],
altz_to_utctz_str(time[1]),
self.message)
@property
def oldhexsha(self):
"""The hexsha to the commit the ref pointed to before the change"""
return self[0]
@property
def newhexsha(self):
"""The hexsha to the commit the ref now points to, after the change"""
return self[1]
@property
def actor(self):
"""Actor instance, providing access"""
return self[2]
@property
def time(self):
"""time as tuple:
* [0] = int(time)
* [1] = int(timezone_offset) in time.altzone format """
return self[3]
@property
def message(self):
"""Message describing the operation that acted on the reference"""
return self[4]
@classmethod
def new(cls, oldhexsha, newhexsha, actor, time, tz_offset, message): # skipcq: PYL-W0621
""":return: New instance of a RefLogEntry"""
if not isinstance(actor, Actor):
raise ValueError("Need actor instance, got %s" % actor)
# END check types
return RefLogEntry((oldhexsha, newhexsha, actor, (time, tz_offset), message))
@classmethod
def from_line(cls, line):
""":return: New RefLogEntry instance from the given revlog line.
:param line: line bytes without trailing newline
:raise ValueError: If line could not be parsed"""
line = line.decode(defenc)
fields = line.split('\t', 1)
if len(fields) == 1:
info, msg = fields[0], None
elif len(fields) == 2:
info, msg = fields
else:
raise ValueError("Line must have up to two TAB-separated fields."
" Got %s" % repr(line))
# END handle first split
oldhexsha = info[:40]
newhexsha = info[41:81]
for hexsha in (oldhexsha, newhexsha):
if not cls._re_hexsha_only.match(hexsha):
raise ValueError("Invalid hexsha: %r" % (hexsha,))
# END if hexsha re doesn't match
# END for each hexsha
email_end = info.find('>', 82)
if email_end == -1:
raise ValueError("Missing token: >")
# END handle missing end brace
actor = Actor._from_string(info[82:email_end + 1])
time, tz_offset = parse_date(info[email_end + 2:]) # skipcq: PYL-W0621
return RefLogEntry((oldhexsha, newhexsha, actor, (time, tz_offset), msg))
class RefLog(list, Serializable):
"""A reflog contains reflog entries, each of which defines a certain state
of the head in question. Custom query methods allow retrieving log entries
by date or by other criteria.
Reflog entries are ordered, the first added entry is first in the list, the last
entry, i.e. the last change of the head or reference, is last in the list."""
__slots__ = ('_path', )
def __new__(cls, filepath=None):
inst = super(RefLog, cls).__new__(cls)
return inst
def __init__(self, filepath=None):
"""Initialize this instance with an optional filepath, from which we will
initialize our data. The path is also used to write changes back using
the write() method"""
self._path = filepath
if filepath is not None:
self._read_from_file()
# END handle filepath
def _read_from_file(self):
try:
fmap = file_contents_ro_filepath(self._path, stream=True, allow_mmap=True)
except OSError:
# it is possible and allowed that the file doesn't exist !
return
# END handle invalid log
try:
self._deserialize(fmap)
finally:
fmap.close()
# END handle closing of handle
#{ Interface
@classmethod
def from_file(cls, filepath):
"""
:return: a new RefLog instance containing all entries from the reflog
at the given filepath
:param filepath: path to reflog
:raise ValueError: If the file could not be read or was corrupted in some way"""
return cls(filepath)
@classmethod
def path(cls, ref):
"""
:return: string with the absolute path at which the reflog of the given ref
instance would be found. The path is not guaranteed to point to a valid
file though.
:param ref: SymbolicReference instance"""
return osp.join(ref.repo.git_dir, "logs", to_native_path(ref.path))
@classmethod
def iter_entries(cls, stream):
"""
:return: Iterator yielding RefLogEntry instances, one for each line read
from the given stream.
:param stream: file-like object containing the revlog in its native format
or basestring instance pointing to a file to read"""
new_entry = RefLogEntry.from_line
if isinstance(stream, str):
stream = file_contents_ro_filepath(stream)
# END handle stream type
while True:
line = stream.readline()
if not line:
return
yield new_entry(line.strip())
# END endless loop
stream.close()
@classmethod
def entry_at(cls, filepath, index):
""":return: RefLogEntry at the given index
:param filepath: full path to the reflog file from which to read the entry
:param index: python list compatible index, i.e. it may be negative to
specify an entry counted from the end of the list
:raise IndexError: If the entry didn't exist
.. note:: This method is faster as it only parses the entry at index, skipping
all other lines. Nonetheless, the whole file has to be read if
the index is negative
"""
with open(filepath, 'rb') as fp:
if index < 0:
return RefLogEntry.from_line(fp.readlines()[index].strip())
# read until index is reached
for i in range(index + 1):
line = fp.readline()
if not line:
break
# END abort on eof
# END handle runup
if i != index or not line: # skipcq:PYL-W0631
raise IndexError
# END handle exception
return RefLogEntry.from_line(line.strip())
# END handle index
def to_file(self, filepath):
"""Write the contents of the reflog instance to a file at the given filepath.
:param filepath: path to file, parent directories are assumed to exist"""
lfd = LockedFD(filepath)
assure_directory_exists(filepath, is_file=True)
fp = lfd.open(write=True, stream=True)
try:
self._serialize(fp)
lfd.commit()
except Exception:
# on failure it rolls back automatically, but we make it clear
lfd.rollback()
raise
# END handle change
@classmethod
def append_entry(cls, config_reader, filepath, oldbinsha, newbinsha, message):
"""Append a new log entry to the revlog at filepath.
:param config_reader: configuration reader of the repository - used to obtain
user information. May also be an Actor instance identifying the committer directly.
May also be None
:param filepath: full path to the log file
:param oldbinsha: binary sha of the previous commit
:param newbinsha: binary sha of the current commit
:param message: message describing the change to the reference
:return: RefLogEntry object which was appended to the log
:note: As we are append-only, concurrent access is not a problem as we
do not interfere with readers."""
if len(oldbinsha) != 20 or len(newbinsha) != 20:
raise ValueError("Shas need to be given in binary format")
# END handle sha type
assure_directory_exists(filepath, is_file=True)
first_line = message.split('\n')[0]
committer = isinstance(config_reader, Actor) and config_reader or Actor.committer(config_reader)
entry = RefLogEntry((
bin_to_hex(oldbinsha).decode('ascii'),
bin_to_hex(newbinsha).decode('ascii'),
committer, (int(time.time()), time.altzone), first_line
))
lf = LockFile(filepath)
lf._obtain_lock_or_raise()
fd = open(filepath, 'ab')
try:
fd.write(entry.format().encode(defenc))
finally:
fd.close()
lf._release_lock()
# END handle write operation
return entry
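# Editor's sketch, not part of the original class; `repo` is a hypothetical
# git.Repo whose HEAD already has a reflog.
#
#   >>> path = RefLog.path(repo.head)
#   >>> log = RefLog.from_file(path)
#   >>> isinstance(log[-1], RefLogEntry), len(log[-1].newhexsha)
#   (True, 40)
#   >>> RefLog.entry_at(path, -1).newhexsha == log[-1].newhexsha
#   True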
def write(self):
"""Write this instance's data to the file we are originating from
:return: self"""
if self._path is None:
raise ValueError("Instance was not initialized with a path, use to_file(...) instead")
# END assert path
self.to_file(self._path)
return self
#} END interface
#{ Serializable Interface
def _serialize(self, stream):
write = stream.write
# write all entries
for e in self:
write(e.format().encode(defenc))
# END for each entry
def _deserialize(self, stream):
self.extend(self.iter_entries(stream))
#} END serializable interface

View File

@ -1,126 +0,0 @@
from git.util import (
LazyMixin,
Iterable,
)
from .symbolic import SymbolicReference
__all__ = ["Reference"]
#{ Utilities
def require_remote_ref_path(func):
"""A decorator raising a TypeError if we are not a valid remote, based on the path"""
def wrapper(self, *args):
if not self.is_remote():
raise ValueError("ref path does not point to a remote reference: %s" % self.path)
return func(self, *args)
# END wrapper
wrapper.__name__ = func.__name__
return wrapper
#}END utilities
class Reference(SymbolicReference, LazyMixin, Iterable):
"""Represents a named reference to any object. Subclasses may apply restrictions though,
i.e. Heads can only point to commits."""
__slots__ = ()
_points_to_commits_only = False
_resolve_ref_on_create = True
_common_path_default = "refs"
def __init__(self, repo, path, check_path=True):
"""Initialize this instance
:param repo: Our parent repository
:param path:
Path relative to the .git/ directory pointing to the ref in question, i.e.
refs/heads/master
:param check_path: if False, you can provide any path. Otherwise the path must start with the
default path prefix of this type."""
if check_path and not path.startswith(self._common_path_default + '/'):
raise ValueError("Cannot instantiate %r from path %s" % (self.__class__.__name__, path))
super(Reference, self).__init__(repo, path)
def __str__(self):
return self.name
#{ Interface
def set_object(self, object, logmsg=None): # @ReservedAssignment
"""Special version which checks if the head-log needs an update as well
:return: self"""
oldbinsha = None
if logmsg is not None:
head = self.repo.head
if not head.is_detached and head.ref == self:
oldbinsha = self.commit.binsha
# END handle commit retrieval
# END handle message is set
super(Reference, self).set_object(object, logmsg)
if oldbinsha is not None:
# /* from refs.c in git-source
# * Special hack: If a branch is updated directly and HEAD
# * points to it (may happen on the remote side of a push
# * for example) then logically the HEAD reflog should be
# * updated too.
# * A generic solution implies reverse symref information,
# * but finding all symrefs pointing to the given branch
# * would be rather costly for this rare event (the direct
# * update of a branch) to be worth it. So let's cheat and
# * check with HEAD only which should cover 99% of all usage
# * scenarios (even 100% of the default ones).
# */
self.repo.head.log_append(oldbinsha, logmsg)
# END check if the head
return self
# NOTE: We don't have to overwrite the properties, as they will only work without the log
@property
def name(self):
""":return: (shortest) Name of this reference - it may contain path components"""
# first two path tokens can be removed as they are
# refs/heads or refs/tags or refs/remotes
tokens = self.path.split('/')
if len(tokens) < 3:
return self.path # could be refs/HEAD
return '/'.join(tokens[2:])
@classmethod
def iter_items(cls, repo, common_path=None):
"""Equivalent to SymbolicReference.iter_items, but will return non-detached
references as well."""
return cls._iter_items(repo, common_path)
#}END interface
#{ Remote Interface
@property
@require_remote_ref_path
def remote_name(self):
"""
:return:
Name of the remote we are a reference of, such as 'origin' for a reference
named 'origin/master'"""
tokens = self.path.split('/')
# /refs/remotes/<remote name>/<branch_name>
return tokens[2]
@property
@require_remote_ref_path
def remote_head(self):
""":return: Name of the remote head itself, i.e. master.
:note: The returned name is usually not qualified enough to uniquely identify
a branch"""
tokens = self.path.split('/')
return '/'.join(tokens[3:])
#} END remote interface

View File

@ -1,52 +0,0 @@
import os
from git.util import join_path
import os.path as osp
from .head import Head
__all__ = ["RemoteReference"]
class RemoteReference(Head):
"""Represents a reference pointing to a remote head."""
_common_path_default = Head._remote_common_path_default
@classmethod
def iter_items(cls, repo, common_path=None, remote=None):
"""Iterate remote references, and if given, constrain them to the given remote"""
common_path = common_path or cls._common_path_default
if remote is not None:
common_path = join_path(common_path, str(remote))
# END handle remote constraint
return super(RemoteReference, cls).iter_items(repo, common_path)
@classmethod
def delete(cls, repo, *refs, **kwargs):
"""Delete the given remote references
:note:
kwargs are given for compatibility with the base class method as we
should not narrow the signature."""
repo.git.branch("-d", "-r", *refs)
# the official deletion method will ignore remote symbolic refs - these
# are generally ignored in the refs/ folder. We don't though
# and delete remainders manually
for ref in refs:
try:
os.remove(osp.join(repo.common_dir, ref.path))
except OSError:
pass
try:
os.remove(osp.join(repo.git_dir, ref.path))
except OSError:
pass
# END for each ref
@classmethod
def create(cls, *args, **kwargs):
"""Used to disable this method"""
raise TypeError("Cannot explicitly create remote references")

View File

@ -1,678 +0,0 @@
import os
from git.compat import defenc
from git.objects import Object, Commit
from git.util import (
join_path,
join_path_native,
to_native_path_linux,
assure_directory_exists,
hex_to_bin,
LockedFD
)
from gitdb.exc import (
BadObject,
BadName
)
import os.path as osp
from .log import RefLog
__all__ = ["SymbolicReference"]
def _git_dir(repo, path):
""" Find the git dir that's appropriate for the path"""
name = "%s" % (path,)
if name in ['HEAD', 'ORIG_HEAD', 'FETCH_HEAD', 'index', 'logs']:
return repo.git_dir
return repo.common_dir
class SymbolicReference(object):
"""Represents a special case of a reference such that this reference is symbolic.
It does not point to a specific commit, but to another Head, which itself
specifies a commit.
A typical example for a symbolic reference is HEAD."""
__slots__ = ("repo", "path")
_resolve_ref_on_create = False
_points_to_commits_only = True
_common_path_default = ""
_remote_common_path_default = "refs/remotes"
_id_attribute_ = "name"
def __init__(self, repo, path):
self.repo = repo
self.path = path
def __str__(self):
return self.path
def __repr__(self):
return '<git.%s "%s">' % (self.__class__.__name__, self.path)
def __eq__(self, other):
if hasattr(other, 'path'):
return self.path == other.path
return False
def __ne__(self, other):
return not (self == other)
def __hash__(self):
return hash(self.path)
@property
def name(self):
"""
:return:
In case of symbolic references, the shortest assumable name
is the path itself."""
return self.path
@property
def abspath(self):
return join_path_native(_git_dir(self.repo, self.path), self.path)
@classmethod
def _get_packed_refs_path(cls, repo):
return osp.join(repo.common_dir, 'packed-refs')
@classmethod
def _iter_packed_refs(cls, repo):
"""Returns an iterator yielding pairs of sha1/path pairs (as bytes) for the corresponding refs.
:note: The packed refs file will be kept open as long as we iterate"""
try:
with open(cls._get_packed_refs_path(repo), 'rt') as fp:
for line in fp:
line = line.strip()
if not line:
continue
if line.startswith('#'):
# "# pack-refs with: peeled fully-peeled sorted"
# the git source code shows "peeled",
# "fully-peeled" and "sorted" as the keywords
# that can go on this line, as per comments in git file
# refs/packed-backend.c
# I looked at master on 2017-10-11,
# commit 111ef79afe, after tag v2.15.0-rc1
# from repo https://github.com/git/git.git
if line.startswith('# pack-refs with:') and 'peeled' not in line:
raise TypeError("PackingType of packed-Refs not understood: %r" % line)
# END abort if we do not understand the packing scheme
continue
# END parse comment
# skip dereferenced tag object entries - previous line was actual
# tag reference for it
if line[0] == '^':
continue
yield tuple(line.split(' ', 1))
# END for each line
except (OSError, IOError):
return
# END no packed-refs file handling
# NOTE: Had try-finally block around here to close the fp,
# but some python version wouldn't allow yields within that.
# I believe files are closing themselves on destruction, so it is
# alright.
@classmethod
def dereference_recursive(cls, repo, ref_path):
"""
:return: hexsha stored in the reference at the given ref_path, recursively dereferencing all
intermediate references as required
:param repo: the repository containing the reference at ref_path"""
while True:
hexsha, ref_path = cls._get_ref_info(repo, ref_path)
if hexsha is not None:
return hexsha
# END recursive dereferencing
@classmethod
def _get_ref_info_helper(cls, repo, ref_path):
"""Return: (str(sha), str(target_ref_path)) if available, the sha the file at
rela_path points to, or None. target_ref_path is the reference we
point to, or None"""
tokens = None
repodir = _git_dir(repo, ref_path)
try:
with open(osp.join(repodir, ref_path), 'rt', encoding='UTF-8') as fp:
value = fp.read().rstrip()
# Don't only split on spaces, but on whitespace, which allows to parse lines like
# 60b64ef992065e2600bfef6187a97f92398a9144 branch 'master' of git-server:/path/to/repo
tokens = value.split()
assert(len(tokens) != 0)
except (OSError, IOError):
# Probably we are just packed, find our entry in the packed refs file
# NOTE: We are not a symbolic ref if we are in a packed file, as these
# are excluded explicitly
for sha, path in cls._iter_packed_refs(repo):
if path != ref_path:
continue
# sha will be used
tokens = sha, path
break
# END for each packed ref
# END handle packed refs
if tokens is None:
raise ValueError("Reference at %r does not exist" % ref_path)
# is it a reference ?
if tokens[0] == 'ref:':
return (None, tokens[1])
# it's a commit
if repo.re_hexsha_only.match(tokens[0]):
return (tokens[0], None)
raise ValueError("Failed to parse reference information from %r" % ref_path)
@classmethod
def _get_ref_info(cls, repo, ref_path):
"""Return: (str(sha), str(target_ref_path)) if available, the sha the file at
rela_path points to, or None. target_ref_path is the reference we
point to, or None"""
return cls._get_ref_info_helper(repo, ref_path)
def _get_object(self):
"""
:return:
The object our ref currently refers to. Refs can be cached, they will
always point to the actual object as it gets re-created on each query"""
# have to be dynamic here as we may be a tag which can point to anything
# Our path will be resolved to the hexsha which will be used accordingly
return Object.new_from_sha(self.repo, hex_to_bin(self.dereference_recursive(self.repo, self.path)))
def _get_commit(self):
"""
:return:
Commit object we point to, works for detached and non-detached
SymbolicReferences. The symbolic reference will be dereferenced recursively."""
obj = self._get_object()
if obj.type == 'tag':
obj = obj.object
# END dereference tag
if obj.type != Commit.type:
raise TypeError("Symbolic Reference pointed to object %r, commit was required" % obj)
# END handle type
return obj
def set_commit(self, commit, logmsg=None):
"""As set_object, but restricts the type of object to be a Commit
:raise ValueError: If commit is not a Commit object or doesn't point to
a commit
:return: self"""
# check the type - assume the best if it is a base-string
invalid_type = False
if isinstance(commit, Object):
invalid_type = commit.type != Commit.type
elif isinstance(commit, SymbolicReference):
invalid_type = commit.object.type != Commit.type
else:
try:
invalid_type = self.repo.rev_parse(commit).type != Commit.type
except (BadObject, BadName) as e:
raise ValueError("Invalid object: %s" % commit) from e
# END handle exception
# END verify type
if invalid_type:
raise ValueError("Need commit, got %r" % commit)
# END handle raise
# we leave strings to the rev-parse method below
self.set_object(commit, logmsg)
return self
def set_object(self, object, logmsg=None): # @ReservedAssignment
"""Set the object we point to, possibly dereference our symbolic reference first.
If the reference does not exist, it will be created
:param object: a refspec, a SymbolicReference or an Object instance. SymbolicReferences
will be dereferenced beforehand to obtain the object they point to
:param logmsg: If not None, the message will be used in the reflog entry to be
written. Otherwise the reflog is not altered
:note: plain SymbolicReferences may not actually point to objects by convention
:return: self"""
if isinstance(object, SymbolicReference):
object = object.object # @ReservedAssignment
# END resolve references
is_detached = True
try:
is_detached = self.is_detached
except ValueError:
pass
# END handle non-existing ones
if is_detached:
return self.set_reference(object, logmsg)
# set the commit on our reference
return self._get_reference().set_object(object, logmsg)
commit = property(_get_commit, set_commit, doc="Query or set commits directly")
object = property(_get_object, set_object, doc="Return the object our ref currently refers to")
def _get_reference(self):
""":return: Reference Object we point to
:raise TypeError: If this symbolic reference is detached, hence it doesn't point
to a reference, but to a commit"""
sha, target_ref_path = self._get_ref_info(self.repo, self.path)
if target_ref_path is None:
raise TypeError("%s is a detached symbolic reference as it points to %r" % (self, sha))
return self.from_path(self.repo, target_ref_path)
def set_reference(self, ref, logmsg=None):
"""Set ourselves to the given ref. It will stay a symbol if the ref is a Reference.
Otherwise an Object, given as Object instance or refspec, is assumed and if valid,
will be set which effectively detaches the reference if it was a purely
symbolic one.
:param ref: SymbolicReference instance, Object instance or refspec string
Only if the ref is a SymbolicRef instance, we will point to it. Everything
else is dereferenced to obtain the actual object.
:param logmsg: If set to a string, the message will be used in the reflog.
Otherwise, a reflog entry is not written for the changed reference.
The previous commit of the entry will be the commit we point to now.
See also: log_append()
:return: self
:note: This symbolic reference will not be dereferenced. For that, see
``set_object(...)``"""
write_value = None
obj = None
if isinstance(ref, SymbolicReference):
write_value = "ref: %s" % ref.path
elif isinstance(ref, Object):
obj = ref
write_value = ref.hexsha
elif isinstance(ref, str):
try:
obj = self.repo.rev_parse(ref + "^{}") # optionally deref tags
write_value = obj.hexsha
except (BadObject, BadName) as e:
raise ValueError("Could not extract object from %s" % ref) from e
# END end try string
else:
raise ValueError("Unrecognized Value: %r" % ref)
# END try commit attribute
# typecheck
if obj is not None and self._points_to_commits_only and obj.type != Commit.type:
raise TypeError("Require commit, got %r" % obj)
# END verify type
oldbinsha = None
if logmsg is not None:
try:
oldbinsha = self.commit.binsha
except ValueError:
oldbinsha = Commit.NULL_BIN_SHA
# END handle non-existing
# END retrieve old hexsha
fpath = self.abspath
assure_directory_exists(fpath, is_file=True)
lfd = LockedFD(fpath)
fd = lfd.open(write=True, stream=True)
ok = False  # only set to True once the write and commit have succeeded, so the rollback below can fire
try:
fd.write(write_value.encode('ascii') + b'\n')
lfd.commit()
ok = True
finally:
if not ok:
lfd.rollback()
# Adjust the reflog
if logmsg is not None:
self.log_append(oldbinsha, logmsg)
return self
# aliased reference
reference = property(_get_reference, set_reference, doc="Returns the Reference we point to")
ref = reference
def is_valid(self):
"""
:return:
True if the reference is valid, hence it can be read and points to
a valid object or reference."""
try:
self.object
except (OSError, ValueError):
return False
else:
return True
@property
def is_detached(self):
"""
:return:
True if we are a detached reference, hence we point to a specific commit
instead to another reference"""
try:
self.ref
return False
except TypeError:
return True
def log(self):
"""
:return: RefLog for this reference. Its last entry reflects the latest change
applied to this reference
.. note:: As the log is parsed every time, it's recommended to cache it for use
instead of calling this method repeatedly. It should be considered read-only."""
return RefLog.from_file(RefLog.path(self))
def log_append(self, oldbinsha, message, newbinsha=None):
"""Append a logentry to the logfile of this ref
:param oldbinsha: binary sha this ref used to point to
:param message: A message describing the change
:param newbinsha: The sha the ref points to now. If None, our current commit sha
will be used
:return: added RefLogEntry instance"""
# NOTE: we use the committer of the currently active commit - this should be
# correct to allow overriding the committer on a per-commit level.
# See https://github.com/gitpython-developers/GitPython/pull/146
try:
committer_or_reader = self.commit.committer
except ValueError:
committer_or_reader = self.repo.config_reader()
# end handle newly cloned repositories
return RefLog.append_entry(committer_or_reader, RefLog.path(self), oldbinsha,
(newbinsha is None and self.commit.binsha) or newbinsha,
message)
def log_entry(self, index):
""":return: RefLogEntry at the given index
:param index: python list compatible positive or negative index
.. note:: This method must read part of the reflog during execution, hence
it should be used sparingly, or only if you need just one index.
In that case, it will be faster than the ``log()`` method"""
return RefLog.entry_at(RefLog.path(self), index)
@classmethod
def to_full_path(cls, path):
"""
:return: string with a full repository-relative path which can be used to initialize
a Reference instance, for instance by using ``Reference.from_path``"""
if isinstance(path, SymbolicReference):
path = path.path
full_ref_path = path
if not cls._common_path_default:
return full_ref_path
if not path.startswith(cls._common_path_default + "/"):
full_ref_path = '%s/%s' % (cls._common_path_default, path)
return full_ref_path
@classmethod
def delete(cls, repo, path):
"""Delete the reference at the given path
:param repo:
Repository to delete the reference from
:param path:
Short or full path pointing to the reference, i.e. refs/myreference
or just "myreference", hence 'refs/' is implied.
Alternatively the symbolic reference to be deleted"""
full_ref_path = cls.to_full_path(path)
abs_path = osp.join(repo.common_dir, full_ref_path)
if osp.exists(abs_path):
os.remove(abs_path)
else:
# check packed refs
pack_file_path = cls._get_packed_refs_path(repo)
try:
with open(pack_file_path, 'rb') as reader:
new_lines = []
made_change = False
dropped_last_line = False
for line in reader:
line = line.decode(defenc)
_, _, line_ref = line.partition(' ')
line_ref = line_ref.strip()
# keep line if it is a comment or if the ref to delete is not
# in the line
# If we deleted the last line and this one is a tag-reference object,
# we drop it as well
if (line.startswith('#') or full_ref_path != line_ref) and \
(not dropped_last_line or dropped_last_line and not line.startswith('^')):
new_lines.append(line)
dropped_last_line = False
continue
# END skip comments and lines without our path
# drop this line
made_change = True
dropped_last_line = True
# write the new lines
if made_change:
# write-binary is required, otherwise windows will
# open the file in text mode and change LF to CRLF !
with open(pack_file_path, 'wb') as fd:
fd.writelines(line.encode(defenc) for line in new_lines)
except (OSError, IOError):
pass # it didn't exist at all
# delete the reflog
reflog_path = RefLog.path(cls(repo, full_ref_path))
if osp.isfile(reflog_path):
os.remove(reflog_path)
# END remove reflog
@classmethod
def _create(cls, repo, path, resolve, reference, force, logmsg=None):
"""internal method used to create a new symbolic reference.
If resolve is False, the reference will be taken as is, creating
a proper symbolic reference. Otherwise it will be resolved to the
corresponding object and a detached symbolic reference will be created
instead"""
git_dir = _git_dir(repo, path)
full_ref_path = cls.to_full_path(path)
abs_ref_path = osp.join(git_dir, full_ref_path)
# figure out target data
target = reference
if resolve:
target = repo.rev_parse(str(reference))
if not force and osp.isfile(abs_ref_path):
target_data = str(target)
if isinstance(target, SymbolicReference):
target_data = target.path
if not resolve:
target_data = "ref: " + target_data
with open(abs_ref_path, 'rb') as fd:
existing_data = fd.read().decode(defenc).strip()
if existing_data != target_data:
raise OSError("Reference at %r does already exist, pointing to %r, requested was %r" %
(full_ref_path, existing_data, target_data))
# END no force handling
ref = cls(repo, full_ref_path)
ref.set_reference(target, logmsg)
return ref
@classmethod
def create(cls, repo, path, reference='HEAD', force=False, logmsg=None):
"""Create a new symbolic reference, hence a reference pointing to another reference.
:param repo:
Repository to create the reference in
:param path:
full path at which the new symbolic reference is supposed to be
created at, i.e. "NEW_HEAD" or "symrefs/my_new_symref"
:param reference:
The reference to which the new symbolic reference should point to.
If it is a commit-ish, the symbolic ref will be detached.
:param force:
if True, force creation even if a symbolic reference with that name already exists.
Raise OSError otherwise
:param logmsg:
If not None, the message to append to the reflog. Otherwise no reflog
entry is written.
:return: Newly created symbolic Reference
:raise OSError:
If a (Symbolic)Reference with the same name but different contents
already exists.
:note: This does not alter the current HEAD, index or Working Tree"""
return cls._create(repo, path, cls._resolve_ref_on_create, reference, force, logmsg)
def rename(self, new_path, force=False):
"""Rename self to a new path
:param new_path:
Either a simple name or a full path, i.e. new_name or features/new_name.
The prefix refs/ is implied for references and will be set as needed.
In case this is a symbolic ref, there is no implied prefix
:param force:
If True, the rename will succeed even if a head with the target name
already exists. It will be overwritten in that case
:return: self
:raise OSError: In case a file at path with different contents already exists """
new_path = self.to_full_path(new_path)
if self.path == new_path:
return self
new_abs_path = osp.join(_git_dir(self.repo, new_path), new_path)
cur_abs_path = osp.join(_git_dir(self.repo, self.path), self.path)
if osp.isfile(new_abs_path):
if not force:
# if they point to the same file, it's not an error
with open(new_abs_path, 'rb') as fd1:
f1 = fd1.read().strip()
with open(cur_abs_path, 'rb') as fd2:
f2 = fd2.read().strip()
if f1 != f2:
raise OSError("File at path %r already exists" % new_abs_path)
# else: we could remove ourselves and use the other one, but
# for clarity we just continue as usual
# END not force handling
os.remove(new_abs_path)
# END handle existing target file
dname = osp.dirname(new_abs_path)
if not osp.isdir(dname):
os.makedirs(dname)
# END create directory
os.rename(cur_abs_path, new_abs_path)
self.path = new_path
return self
@classmethod
def _iter_items(cls, repo, common_path=None):
if common_path is None:
common_path = cls._common_path_default
rela_paths = set()
# walk loose refs
# Currently we do not follow links
for root, dirs, files in os.walk(join_path_native(repo.common_dir, common_path)):
if 'refs' not in root.split(os.sep): # skip non-refs subfolders
refs_id = [d for d in dirs if d == 'refs']
if refs_id:
dirs[0:] = ['refs']
# END prune non-refs folders
for f in files:
if f == 'packed-refs':
continue
abs_path = to_native_path_linux(join_path(root, f))
rela_paths.add(abs_path.replace(to_native_path_linux(repo.common_dir) + '/', ""))
# END for each file in root directory
# END for each directory to walk
# read packed refs
for _sha, rela_path in cls._iter_packed_refs(repo):
if rela_path.startswith(common_path):
rela_paths.add(rela_path)
# END relative path matches common path
# END packed refs reading
# return paths in sorted order
for path in sorted(rela_paths):
try:
yield cls.from_path(repo, path)
except ValueError:
continue
# END for each sorted relative refpath
@classmethod
def iter_items(cls, repo, common_path=None):
"""Find all refs in the repository
:param repo: is the Repo
:param common_path:
Optional keyword argument to the path which is to be shared by all
returned Ref objects.
Defaults to the class-specific portion if None, assuring that only
refs suitable for the actual class are returned.
:return:
git.SymbolicReference[], each of them guaranteed to be a symbolic
ref which is not detached and points to a valid ref.
The list is lexicographically sorted.
The returned objects represent actual subclasses, such as Head or TagReference"""
return (r for r in cls._iter_items(repo, common_path) if r.__class__ == SymbolicReference or not r.is_detached)
@classmethod
def from_path(cls, repo, path):
"""
:param path: full .git-directory-relative path name to the Reference to instantiate
:note: use to_full_path() if you only have a partial path of a known Reference Type
:return:
Instance of type Reference, Head, or Tag
depending on the given path"""
if not path:
raise ValueError("Cannot create Reference from %r" % path)
# Names like HEAD are inserted after the refs module is imported - we have an import dependency
# cycle and don't want to import these names in-function
from . import HEAD, Head, RemoteReference, TagReference, Reference
for ref_type in (HEAD, Head, RemoteReference, TagReference, Reference, SymbolicReference):
try:
instance = ref_type(repo, path)
if instance.__class__ == SymbolicReference and instance.is_detached:
raise ValueError("SymbolRef was detached, we drop it")
return instance
except ValueError:
pass
# END exception handling
# END for each type to try
raise ValueError("Could not find reference type suitable to handle path %r" % path)
def is_remote(self):
""":return: True if this symbolic reference points to a remote branch"""
return self.path.startswith(self._remote_common_path_default + "/")
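# Editor's sketch, not part of the original class; `repo` is a hypothetical
# git.Repo. Passing an existing reference keeps the new ref purely symbolic.
#
#   >>> master = repo.heads.master
#   >>> sym = SymbolicReference.create(repo, 'MY_HEAD', reference=master)
#   >>> sym.reference == master
#   True
#   >>> SymbolicReference.delete(repo, 'MY_HEAD')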

View File

@ -1,93 +0,0 @@
from .reference import Reference
__all__ = ["TagReference", "Tag"]
class TagReference(Reference):
"""Class representing a lightweight tag reference which either points to a commit
,a tag object or any other object. In the latter case additional information,
like the signature or the tag-creator, is available.
This tag object will always point to a commit object, but may carry additional
information in a tag object::
tagref = TagReference.list_items(repo)[0]
print(tagref.commit.message)
if tagref.tag is not None:
print(tagref.tag.message)"""
__slots__ = ()
_common_path_default = "refs/tags"
@property
def commit(self):
""":return: Commit object the tag ref points to
:raise ValueError: if the tag points to a tree or blob"""
obj = self.object
while obj.type != 'commit':
if obj.type == "tag":
# it is a tag object which carries the commit as an object - we can point to anything
obj = obj.object
else:
raise ValueError(("Cannot resolve commit as tag %s points to a %s object - " +
"use the `.object` property instead to access it") % (self, obj.type))
return obj
@property
def tag(self):
"""
:return: Tag object this tag ref points to or None in case
we are a light weight tag"""
obj = self.object
if obj.type == "tag":
return obj
return None
# make object read-only
# It should be reasonably hard to adjust an existing tag
object = property(Reference._get_object)
@classmethod
def create(cls, repo, path, ref='HEAD', message=None, force=False, **kwargs):
"""Create a new tag reference.
:param path:
The name of the tag, i.e. 1.0 or releases/1.0.
The prefix refs/tags is implied
:param ref:
A reference to the object you want to tag. It can be a commit, tree or
blob.
:param message:
If not None, the message will be used in your tag object. This will also
create an additional tag object that allows obtaining that information, i.e.::
tagref.tag.message
:param force:
If True, force creation of the tag even if a tag with that name already exists.
:param kwargs:
Additional keyword arguments to be passed to git-tag
:return: A new TagReference"""
args = (path, ref)
if message:
kwargs['m'] = message
if force:
kwargs['f'] = True
repo.git.tag(*args, **kwargs)
return TagReference(repo, "%s/%s" % (cls._common_path_default, path))
@classmethod
def delete(cls, repo, *tags):
"""Delete the given existing tag or tags"""
repo.git.tag("-d", *tags)
# provide an alias
Tag = TagReference
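# Editor's sketch, not part of the original module; `repo` is a hypothetical
# git.Repo with at least one commit. A message turns the lightweight tag into
# an annotated one carrying a tag object.
#
#   >>> light = TagReference.create(repo, 'v-light')
#   >>> light.tag is None
#   True
#   >>> annotated = TagReference.create(repo, 'v-annotated', message='first release')
#   >>> annotated.tag.message
#   'first release'
#   >>> TagReference.delete(repo, light, annotated)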

View File

@ -1,881 +0,0 @@
# remote.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
# Module implementing a remote object allowing easy access to git remotes
import logging
import re
from git.cmd import handle_process_output, Git
from git.compat import (defenc, force_text, is_win)
from git.exc import GitCommandError
from git.util import (
LazyMixin,
Iterable,
IterableList,
RemoteProgress,
CallableRemoteProgress
)
from git.util import (
join_path,
)
import os.path as osp
from .config import (
SectionConstraint,
cp,
)
from .refs import (
Head,
Reference,
RemoteReference,
SymbolicReference,
TagReference
)
log = logging.getLogger('git.remote')
log.addHandler(logging.NullHandler())
__all__ = ('RemoteProgress', 'PushInfo', 'FetchInfo', 'Remote')
#{ Utilities
def add_progress(kwargs, git, progress):
"""Add the --progress flag to the given kwargs dict if supported by the
git command. If no progress instance is given, we do not request any
progress output.
:return: possibly altered kwargs"""
if progress is not None:
v = git.version_info[:2]
if v >= (1, 7):
kwargs['progress'] = True
# END handle --progress
# END handle progress
return kwargs
#} END utilities
def to_progress_instance(progress):
"""Given the 'progress' return a suitable object derived from
RemoteProgress().
"""
# new API only needs progress as a function
if callable(progress):
return CallableRemoteProgress(progress)
# where None is passed create a parser that eats the progress
elif progress is None:
return RemoteProgress()
# assume it's the old API with an instance of RemoteProgress.
return progress
class PushInfo(object):
"""
Carries information about the result of a push operation of a single head::
info = remote.push()[0]
info.flags # bitflags providing more information about the result
info.local_ref # Reference pointing to the local reference that was pushed
# It is None if the ref was deleted.
info.remote_ref_string # path to the remote reference located on the remote side
info.remote_ref # Remote Reference on the local side corresponding to
# the remote_ref_string. It can be a TagReference as well.
info.old_commit # commit at which the remote_ref was standing before we pushed
# it to local_ref.commit. Will be None if an error was indicated
info.summary # summary line providing human-readable English text about the push
"""
__slots__ = ('local_ref', 'remote_ref_string', 'flags', '_old_commit_sha', '_remote', 'summary')
NEW_TAG, NEW_HEAD, NO_MATCH, REJECTED, REMOTE_REJECTED, REMOTE_FAILURE, DELETED, \
FORCED_UPDATE, FAST_FORWARD, UP_TO_DATE, ERROR = [1 << x for x in range(11)]
_flag_map = {'X': NO_MATCH,
'-': DELETED,
'*': 0,
'+': FORCED_UPDATE,
' ': FAST_FORWARD,
'=': UP_TO_DATE,
'!': ERROR}
def __init__(self, flags, local_ref, remote_ref_string, remote, old_commit=None,
summary=''):
""" Initialize a new instance """
self.flags = flags
self.local_ref = local_ref
self.remote_ref_string = remote_ref_string
self._remote = remote
self._old_commit_sha = old_commit
self.summary = summary
@property
def old_commit(self):
return self._old_commit_sha and self._remote.repo.commit(self._old_commit_sha) or None
@property
def remote_ref(self):
"""
:return:
Remote Reference or TagReference in the local repository corresponding
to the remote_ref_string kept in this instance."""
# translate heads to a local remote, tags stay as they are
if self.remote_ref_string.startswith("refs/tags"):
return TagReference(self._remote.repo, self.remote_ref_string)
elif self.remote_ref_string.startswith("refs/heads"):
remote_ref = Reference(self._remote.repo, self.remote_ref_string)
return RemoteReference(self._remote.repo, "refs/remotes/%s/%s" % (str(self._remote), remote_ref.name))
else:
raise ValueError("Could not handle remote ref: %r" % self.remote_ref_string)
# END
@classmethod
def _from_line(cls, remote, line):
"""Create a new PushInfo instance as parsed from line which is expected to be like
refs/heads/master:refs/heads/master 05d2687..1d0568e as bytes"""
control_character, from_to, summary = line.split('\t', 3)
flags = 0
# control character handling
try:
flags |= cls._flag_map[control_character]
except KeyError as e:
raise ValueError("Control character %r unknown as parsed from line %r" % (control_character, line)) from e
# END handle control character
# from_to handling
from_ref_string, to_ref_string = from_to.split(':')
if flags & cls.DELETED:
from_ref = None
else:
if from_ref_string == "(delete)":
from_ref = None
else:
from_ref = Reference.from_path(remote.repo, from_ref_string)
# commit handling, could be message or commit info
old_commit = None
if summary.startswith('['):
if "[rejected]" in summary:
flags |= cls.REJECTED
elif "[remote rejected]" in summary:
flags |= cls.REMOTE_REJECTED
elif "[remote failure]" in summary:
flags |= cls.REMOTE_FAILURE
elif "[no match]" in summary:
flags |= cls.ERROR
elif "[new tag]" in summary:
flags |= cls.NEW_TAG
elif "[new branch]" in summary:
flags |= cls.NEW_HEAD
# uptodate encoded in control character
else:
# fast-forward or forced update - was encoded in control character,
# but we parse the old and new commit
split_token = "..."
if control_character == " ":
split_token = ".."
old_sha, _new_sha = summary.split(' ')[0].split(split_token)
# have to use constructor here as the sha usually is abbreviated
old_commit = old_sha
# END message handling
return PushInfo(flags, from_ref, to_ref_string, remote, old_commit, summary)
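# Editor's sketch, not part of the original class; `remote` is a hypothetical
# git.Remote. The flag bits above are meant to be tested with bitwise operators.
#
#   >>> for info in remote.push():
#   ...     if info.flags & (info.ERROR | info.REJECTED | info.REMOTE_REJECTED):
#   ...         print("push of %s failed: %s" % (info.local_ref, info.summary))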
class FetchInfo(object):
"""
Carries information about the results of a fetch operation of a single head::
info = remote.fetch()[0]
info.ref # Symbolic Reference or RemoteReference to the changed
# remote head or FETCH_HEAD
info.flags # additional flags to be & with enumeration members,
# i.e. info.flags & info.REJECTED
# is 0 if ref is SymbolicReference
info.note # additional notes given by git-fetch intended for the user
info.old_commit # if info.flags & info.FORCED_UPDATE|info.FAST_FORWARD,
# field is set to the previous location of ref, otherwise None
info.remote_ref_path # The path from which we fetched on the remote. It's the remote's version of our info.ref
"""
__slots__ = ('ref', 'old_commit', 'flags', 'note', 'remote_ref_path')
NEW_TAG, NEW_HEAD, HEAD_UPTODATE, TAG_UPDATE, REJECTED, FORCED_UPDATE, \
FAST_FORWARD, ERROR = [1 << x for x in range(8)]
_re_fetch_result = re.compile(r'^\s*(.) (\[?[\w\s\.$@]+\]?)\s+(.+) -> ([^\s]+)( \(.*\)?$)?')
_flag_map = {
'!': ERROR,
'+': FORCED_UPDATE,
'*': 0,
'=': HEAD_UPTODATE,
' ': FAST_FORWARD,
'-': TAG_UPDATE,
}
@classmethod
def refresh(cls):
"""This gets called by the refresh function (see the top level
__init__).
"""
# clear the old values in _flag_map
try:
del cls._flag_map["t"]
except KeyError:
pass
try:
del cls._flag_map["-"]
except KeyError:
pass
# set the value given the git version
if Git().version_info[:2] >= (2, 10):
cls._flag_map["t"] = cls.TAG_UPDATE
else:
cls._flag_map["-"] = cls.TAG_UPDATE
return True
def __init__(self, ref, flags, note='', old_commit=None, remote_ref_path=None):
"""
Initialize a new instance
"""
self.ref = ref
self.flags = flags
self.note = note
self.old_commit = old_commit
self.remote_ref_path = remote_ref_path
def __str__(self):
return self.name
@property
def name(self):
""":return: Name of our remote ref"""
return self.ref.name
@property
def commit(self):
""":return: Commit of our remote ref"""
return self.ref.commit
@classmethod
def _from_line(cls, repo, line, fetch_line):
"""Parse information from the given line as returned by git-fetch -v
and return a new FetchInfo object representing this information.
We can handle a line as follows
"%c %-*s %-*s -> %s%s"
Where c is either ' ', !, +, -, *, or =
! means error
+ means success forcing update
- means a tag was updated
* means birth of new branch or tag
= means the head was up to date ( and not moved )
' ' means a fast-forward
fetch line is the corresponding line from FETCH_HEAD, like
acb0fa8b94ef421ad60c8507b634759a472cd56c not-for-merge branch '0.1.7RC' of /tmp/tmpya0vairemote_repo"""
match = cls._re_fetch_result.match(line)
if match is None:
raise ValueError("Failed to parse line: %r" % line)
# parse lines
control_character, operation, local_remote_ref, remote_local_ref, note = match.groups()
try:
_new_hex_sha, _fetch_operation, fetch_note = fetch_line.split("\t")
ref_type_name, fetch_note = fetch_note.split(' ', 1)
except ValueError as e: # unpack error
raise ValueError("Failed to parse FETCH_HEAD line: %r" % fetch_line) from e
# parse flags from control_character
flags = 0
try:
flags |= cls._flag_map[control_character]
except KeyError as e:
raise ValueError("Control character %r unknown as parsed from line %r" % (control_character, line)) from e
# END control char exception handling
# parse operation string for more info - makes no sense for symbolic refs, but we parse it anyway
old_commit = None
is_tag_operation = False
if 'rejected' in operation:
flags |= cls.REJECTED
if 'new tag' in operation:
flags |= cls.NEW_TAG
is_tag_operation = True
if 'tag update' in operation:
flags |= cls.TAG_UPDATE
is_tag_operation = True
if 'new branch' in operation:
flags |= cls.NEW_HEAD
if '...' in operation or '..' in operation:
split_token = '...'
if control_character == ' ':
split_token = split_token[:-1]
old_commit = repo.rev_parse(operation.split(split_token)[0])
# END handle refspec
# handle FETCH_HEAD and figure out ref type
# If we do not specify a target branch like master:refs/remotes/origin/master,
# the fetch result is stored in FETCH_HEAD which destroys the rule we usually
# have. In that case we use a symbolic reference which is detached
ref_type = None
if remote_local_ref == "FETCH_HEAD":
ref_type = SymbolicReference
elif ref_type_name == "tag" or is_tag_operation:
# the ref_type_name can be branch, whereas we are still seeing a tag operation. It happens during
# testing, which is based on actual git operations
ref_type = TagReference
elif ref_type_name in ("remote-tracking", "branch"):
# note: remote-tracking is just the first part of the 'remote-tracking branch' token.
# We don't parse it correctly, but it's enough to know what to do, and it's new in git 1.7.something
ref_type = RemoteReference
elif '/' in ref_type_name:
# If the fetch spec looks something like this '+refs/pull/*:refs/heads/pull/*', and is thus pretty
# much anything the user wants, we will have trouble determining what's going on
# For now, we assume the local ref is a Head
ref_type = Head
else:
raise TypeError("Cannot handle reference type: %r" % ref_type_name)
# END handle ref type
# create ref instance
if ref_type is SymbolicReference:
remote_local_ref = ref_type(repo, "FETCH_HEAD")
else:
# determine prefix. Tags are usually pulled into refs/tags, they may have subdirectories.
# It is not clear sometimes where exactly the item is, unless we have an absolute path as indicated
# by the 'refs/' prefix. Otherwise even a tag could be in refs/remotes, which is when it will have the
# 'tags/' subdirectory in its path.
# We don't want to test for actual existence, but try to figure everything out analytically.
ref_path = None
remote_local_ref = remote_local_ref.strip()
if remote_local_ref.startswith(Reference._common_path_default + "/"):
# always use actual type if we get absolute paths
# Will always be the case if something is fetched outside of refs/remotes (if it's not a tag)
ref_path = remote_local_ref
if ref_type is not TagReference and not \
remote_local_ref.startswith(RemoteReference._common_path_default + "/"):
ref_type = Reference
# END downgrade remote reference
elif ref_type is TagReference and 'tags/' in remote_local_ref:
# even though it's a tag, it is located in refs/remotes
ref_path = join_path(RemoteReference._common_path_default, remote_local_ref)
else:
ref_path = join_path(ref_type._common_path_default, remote_local_ref)
# END obtain refpath
# even though the path could be within the git conventions, we make
# sure we respect whatever the user wanted, and disabled path checking
remote_local_ref = ref_type(repo, ref_path, check_path=False)
# END create ref instance
note = (note and note.strip()) or ''
return cls(remote_local_ref, flags, note, old_commit, local_remote_ref)
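# --- Editor's note: illustrative sketch, not part of the original file. ---
# A hedged example of inspecting FetchInfo results; the repository path and the
# 'origin' remote are hypothetical assumptions.
def _editor_example_fetch_info():
    import git
    repo = git.Repo('/tmp/example_repo')       # hypothetical repository
    for info in repo.remote('origin').fetch():
        if info.flags & FetchInfo.NEW_HEAD:
            print("new branch fetched:", info.name)
        elif info.flags & (FetchInfo.FAST_FORWARD | FetchInfo.FORCED_UPDATE):
            print("updated:", info.name, "was at", info.old_commit)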
class Remote(LazyMixin, Iterable):
"""Provides easy read and write access to a git remote.
Everything not part of this interface is considered an option for the current
remote, allowing constructs like remote.pushurl to query the pushurl.
NOTE: When querying configuration, the configuration accessor will be cached
to speed up subsequent accesses."""
__slots__ = ("repo", "name", "_config_reader")
_id_attribute_ = "name"
def __init__(self, repo, name):
"""Initialize a remote instance
:param repo: The repository we are a remote of
:param name: the name of the remote, i.e. 'origin'"""
self.repo = repo
self.name = name
if is_win:
# some oddity: on windows, python 2.5, it for some reason does not realize
# that it has the config_writer property, but instead calls __getattr__
# which will not yield the expected results. 'pinging' the members
# with a dir call creates the config_writer property that we require
# ... bugs like these make me wonder whether python really wants to be used
# for production. It doesn't happen on linux though.
dir(self)
# END windows special handling
def __getattr__(self, attr):
"""Allows to call this instance like
remote.special( \\*args, \\*\\*kwargs) to call git-remote special self.name"""
if attr == "_config_reader":
return super(Remote, self).__getattr__(attr)
# sometimes, probably due to a bug in python itself, we are being called
# even though a slot of the same name exists
try:
return self._config_reader.get(attr)
except cp.NoOptionError:
return super(Remote, self).__getattr__(attr)
# END handle exception
def _config_section_name(self):
return 'remote "%s"' % self.name
def _set_cache_(self, attr):
if attr == "_config_reader":
# NOTE: This is cached as __getattr__ is overridden to return remote config values implicitly, such as
# in print(r.pushurl)
self._config_reader = SectionConstraint(self.repo.config_reader("repository"), self._config_section_name())
else:
super(Remote, self)._set_cache_(attr)
def __str__(self):
return self.name
def __repr__(self):
return '<git.%s "%s">' % (self.__class__.__name__, self.name)
def __eq__(self, other):
return isinstance(other, type(self)) and self.name == other.name
def __ne__(self, other):
return not (self == other)
def __hash__(self):
return hash(self.name)
def exists(self):
"""
:return: True if this is a valid, existing remote.
Valid remotes have an entry in the repository's configuration"""
try:
self.config_reader.get('url')
return True
except cp.NoOptionError:
# we have the section at least ...
return True
except cp.NoSectionError:
return False
# end
@classmethod
def iter_items(cls, repo):
""":return: Iterator yielding Remote objects of the given repository"""
for section in repo.config_reader("repository").sections():
if not section.startswith('remote '):
continue
lbound = section.find('"')
rbound = section.rfind('"')
if lbound == -1 or rbound == -1:
raise ValueError("Remote-Section has invalid format: %r" % section)
yield Remote(repo, section[lbound + 1:rbound])
# END for each configuration section
def set_url(self, new_url, old_url=None, **kwargs):
"""Configure URLs on current remote (cf command git remote set_url)
This command manages URLs on the remote.
:param new_url: string being the URL to add as an extra remote URL
:param old_url: when set, replaces this URL with new_url for the remote
:return: self
"""
scmd = 'set-url'
kwargs['insert_kwargs_after'] = scmd
if old_url:
self.repo.git.remote(scmd, self.name, new_url, old_url, **kwargs)
else:
self.repo.git.remote(scmd, self.name, new_url, **kwargs)
return self
def add_url(self, url, **kwargs):
"""Adds a new url on current remote (special case of git remote set_url)
This command adds new URLs to a given remote, making it possible to have
multiple URLs for a single remote.
:param url: string being the URL to add as an extra remote URL
:return: self
"""
return self.set_url(url, add=True)
def delete_url(self, url, **kwargs):
"""Deletes a new url on current remote (special case of git remote set_url)
This command deletes new URLs to a given remote, making it possible to have
multiple URLs for a single remote.
:param url: string being the URL to delete from the remote
:return: self
"""
return self.set_url(url, delete=True)
@property
def urls(self):
""":return: Iterator yielding all configured URL targets on a remote as strings"""
try:
remote_details = self.repo.git.remote("get-url", "--all", self.name)
for line in remote_details.split('\n'):
yield line
except GitCommandError as ex:
## We are on git < 2.7 (i.e. TravisCI as of Oct-2016),
# so the `get-url` command does not exist yet!
# see: https://github.com/gitpython-developers/GitPython/pull/528#issuecomment-252976319
# and: http://stackoverflow.com/a/32991784/548792
#
if 'Unknown subcommand: get-url' in str(ex):
try:
remote_details = self.repo.git.remote("show", self.name)
for line in remote_details.split('\n'):
if ' Push URL:' in line:
yield line.split(': ')[-1]
except GitCommandError as ex:
if any(msg in str(ex) for msg in ['correct access rights', 'cannot run ssh']):
# If ssh is not setup to access this repository, see issue 694
remote_details = self.repo.git.config('--get-all', 'remote.%s.url' % self.name)
for line in remote_details.split('\n'):
yield line
else:
raise ex
else:
raise ex
@property
def refs(self):
"""
:return:
IterableList of RemoteReference objects. It is prefixed, allowing
you to omit the remote path portion, i.e.::
remote.refs.master # yields RemoteReference('/refs/remotes/origin/master')"""
out_refs = IterableList(RemoteReference._id_attribute_, "%s/" % self.name)
out_refs.extend(RemoteReference.list_items(self.repo, remote=self.name))
return out_refs
@property
def stale_refs(self):
"""
:return:
IterableList of RemoteReference objects that no longer have a corresponding
head on the remote side, as they have been deleted there, but are still
available locally.
The IterableList is prefixed, hence the 'origin' must be omitted. See
'refs' property for an example.
To make things more complicated, it is possible for the list to include
other kinds of references, for example, tag references, if these are stale
as well. This is a fix for the issue described here:
https://github.com/gitpython-developers/GitPython/issues/260
"""
out_refs = IterableList(RemoteReference._id_attribute_, "%s/" % self.name)
for line in self.repo.git.remote("prune", "--dry-run", self).splitlines()[2:]:
# expecting
# * [would prune] origin/new_branch
token = " * [would prune] "
if not line.startswith(token):
raise ValueError("Could not parse git-remote prune result: %r" % line)
ref_name = line.replace(token, "")
# sometimes, paths start with a full ref name, like refs/tags/foo, see #260
if ref_name.startswith(Reference._common_path_default + '/'):
out_refs.append(SymbolicReference.from_path(self.repo, ref_name))
else:
fqhn = "%s/%s" % (RemoteReference._common_path_default, ref_name)
out_refs.append(RemoteReference(self.repo, fqhn))
# end special case handling
# END for each line
return out_refs
@classmethod
def create(cls, repo, name, url, **kwargs):
"""Create a new remote to the given repository
:param repo: Repository instance that is to receive the new remote
:param name: Desired name of the remote
:param url: URL which corresponds to the remote's name
:param kwargs: Additional arguments to be passed to the git-remote add command
:return: New Remote instance
:raise GitCommandError: in case an origin with that name already exists"""
scmd = 'add'
kwargs['insert_kwargs_after'] = scmd
repo.git.remote(scmd, name, Git.polish_url(url), **kwargs)
return cls(repo, name)
# add is an alias
add = create
@classmethod
def remove(cls, repo, name):
"""Remove the remote with the given name
:return: the passed remote name to remove
"""
repo.git.remote("rm", name)
if isinstance(name, cls):
name._clear_cache()
return name
# alias
rm = remove
def rename(self, new_name):
"""Rename self to the given new_name
:return: self """
if self.name == new_name:
return self
self.repo.git.remote("rename", self.name, new_name)
self.name = new_name
self._clear_cache()
return self
def update(self, **kwargs):
"""Fetch all changes for this remote, including new branches which will
be forced in ( in case your local remote branch is not part of the new remote branch's
ancestry anymore ).
:param kwargs:
Additional arguments passed to git-remote update
:return: self """
scmd = 'update'
kwargs['insert_kwargs_after'] = scmd
self.repo.git.remote(scmd, self.name, **kwargs)
return self
def _get_fetch_info_from_stderr(self, proc, progress):
progress = to_progress_instance(progress)
# skip first line as it is some remote info we are not interested in
output = IterableList('name')
# lines which are not progress lines are fetch info lines
# this also waits for the command to finish
# Skip some progress lines that don't provide relevant information
fetch_info_lines = []
# Basically we want all fetch info lines which appear to be in regular form, and thus have a
# command character. Everything else we ignore.
cmds = set(FetchInfo._flag_map.keys())
progress_handler = progress.new_message_handler()
handle_process_output(proc, None, progress_handler, finalizer=None, decode_streams=False)
stderr_text = progress.error_lines and '\n'.join(progress.error_lines) or ''
proc.wait(stderr=stderr_text)
if stderr_text:
log.warning("Error lines received while fetching: %s", stderr_text)
for line in progress.other_lines:
line = force_text(line)
for cmd in cmds:
if len(line) > 1 and line[0] == ' ' and line[1] == cmd:
fetch_info_lines.append(line)
continue
# read head information
with open(osp.join(self.repo.common_dir, 'FETCH_HEAD'), 'rb') as fp:
fetch_head_info = [line.decode(defenc) for line in fp.readlines()]
l_fil = len(fetch_info_lines)
l_fhi = len(fetch_head_info)
if l_fil != l_fhi:
msg = "Fetch head lines do not match lines provided via progress information\n"
msg += "length of progress lines %i should be equal to lines in FETCH_HEAD file %i\n"
msg += "Will ignore extra progress lines or fetch head lines."
msg %= (l_fil, l_fhi)
log.debug(msg)
log.debug("info lines: " + str(fetch_info_lines))
log.debug("head info : " + str(fetch_head_info))
if l_fil < l_fhi:
fetch_head_info = fetch_head_info[:l_fil]
else:
fetch_info_lines = fetch_info_lines[:l_fhi]
# end truncate correct list
# end sanity check + sanitization
for err_line, fetch_line in zip(fetch_info_lines, fetch_head_info):
try:
output.append(FetchInfo._from_line(self.repo, err_line, fetch_line))
except ValueError as exc:
log.debug("Caught error while parsing line: %s", exc)
log.warning("Git informed while fetching: %s", err_line.strip())
return output
def _get_push_info(self, proc, progress):
progress = to_progress_instance(progress)
# read progress information from stderr
# we hope stdout can hold all the data, it should ...
# read the lines manually as it will use carriage returns between the messages
# to override the previous one. This is why we read the bytes manually
progress_handler = progress.new_message_handler()
output = []
def stdout_handler(line):
try:
output.append(PushInfo._from_line(self, line))
except ValueError:
# If an error happens, additional info is given which we parse below.
pass
handle_process_output(proc, stdout_handler, progress_handler, finalizer=None, decode_streams=False)
stderr_text = progress.error_lines and '\n'.join(progress.error_lines) or ''
try:
proc.wait(stderr=stderr_text)
except Exception:
if not output:
raise
elif stderr_text:
log.warning("Error lines received while fetching: %s", stderr_text)
return output
def _assert_refspec(self):
"""Turns out we can't deal with remotes if the refspec is missing"""
config = self.config_reader
unset = 'placeholder'
try:
if config.get_value('fetch', default=unset) is unset:
msg = "Remote '%s' has no refspec set.\n"
msg += "You can set it as follows:"
msg += " 'git config --add \"remote.%s.fetch +refs/heads/*:refs/heads/*\"'."
raise AssertionError(msg % (self.name, self.name))
finally:
config.release()
def fetch(self, refspec=None, progress=None, verbose=True, **kwargs):
"""Fetch the latest changes for this remote
:param refspec:
A "refspec" is used by fetch and push to describe the mapping
between remote ref and local ref. They are combined with a colon in
the format <src>:<dst>, preceded by an optional plus sign, +.
For example: git fetch $URL refs/heads/master:refs/heads/origin means
"grab the master branch head from the $URL and store it as my origin
branch head". And git push $URL refs/heads/master:refs/heads/to-upstream
means "publish my master branch head as to-upstream branch at $URL".
See also git-push(1).
Taken from the git manual
Fetch supports multiple refspecs (as the
underlying git-fetch does) - supplying a list rather than a string
for 'refspec' will make use of this facility.
:param progress: See 'push' method
:param verbose: Boolean for verbose output
:param kwargs: Additional arguments to be passed to git-fetch
:return:
IterableList(FetchInfo, ...) list of FetchInfo instances providing detailed
information about the fetch results
:note:
As fetch does not provide progress information to non-ttys, we unfortunately
cannot make it available here the way the 'push' method does."""
if refspec is None:
# No argument refspec, then ensure the repo's config has a fetch refspec.
self._assert_refspec()
kwargs = add_progress(kwargs, self.repo.git, progress)
if isinstance(refspec, list):
args = refspec
else:
args = [refspec]
proc = self.repo.git.fetch(self, *args, as_process=True, with_stdout=False,
universal_newlines=True, v=verbose, **kwargs)
res = self._get_fetch_info_from_stderr(proc, progress)
if hasattr(self.repo.odb, 'update_cache'):
self.repo.odb.update_cache()
return res
def pull(self, refspec=None, progress=None, **kwargs):
"""Pull changes from the given branch, being the same as a fetch followed
by a merge of branch with your local branch.
:param refspec: see 'fetch' method
:param progress: see 'push' method
:param kwargs: Additional arguments to be passed to git-pull
:return: Please see 'fetch' method """
if refspec is None:
# No argument refspec, then ensure the repo's config has a fetch refspec.
self._assert_refspec()
kwargs = add_progress(kwargs, self.repo.git, progress)
proc = self.repo.git.pull(self, refspec, with_stdout=False, as_process=True,
universal_newlines=True, v=True, **kwargs)
res = self._get_fetch_info_from_stderr(proc, progress)
if hasattr(self.repo.odb, 'update_cache'):
self.repo.odb.update_cache()
return res
def push(self, refspec=None, progress=None, **kwargs):
"""Push changes from source branch in refspec to target branch in refspec.
:param refspec: see 'fetch' method
:param progress:
Can take one of many value types:
* None to discard progress information
* A function (callable) that is called with the progress information.
Signature: ``progress(op_code, cur_count, max_count=None, message='')``.
`Click here <http://goo.gl/NPa7st>`__ for a description of all arguments
given to the function.
* An instance of a class derived from ``git.RemoteProgress`` that
overrides the ``update()`` function.
:note: No further progress information is returned after push returns.
:param kwargs: Additional arguments to be passed to git-push
:return:
list(PushInfo, ...) list of PushInfo instances, each
one informing about an individual head which had been updated on the remote
side.
If the push contains rejected heads, these will have the PushInfo.ERROR bit set
in their flags.
If the operation fails completely, the length of the returned IterableList will
be 0."""
kwargs = add_progress(kwargs, self.repo.git, progress)
proc = self.repo.git.push(self, refspec, porcelain=True, as_process=True,
universal_newlines=True, **kwargs)
return self._get_push_info(proc, progress)
@property
def config_reader(self):
"""
:return:
GitConfigParser compatible object able to read options for only our remote.
Hence you may simply type config.get("pushurl") to obtain the information"""
return self._config_reader
def _clear_cache(self):
try:
del(self._config_reader)
except AttributeError:
pass
# END handle exception
@property
def config_writer(self):
"""
:return: GitConfigParser compatible object able to write options for this remote.
:note:
You can only own one writer at a time - delete it to release the
configuration file and make it usable by others.
To assure consistent results, you should only query options through the
writer. Once you are done writing, you are free to use the config reader
once again."""
writer = self.repo.config_writer()
# clear our cache to assure we re-read the possibly changed configuration
self._clear_cache()
return SectionConstraint(writer, self._config_section_name())
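# --- Editor's note: illustrative sketch, not part of the original file. ---
# End-to-end usage of the Remote class defined above; the repository path,
# remote name and URLs are hypothetical assumptions.
def _editor_example_remote_usage():
    import git
    repo = git.Repo('/tmp/example_repo')                           # hypothetical repository
    upstream = repo.create_remote('upstream', 'https://example.com/project.git')
    upstream.add_url('git@example.com:project.git')                 # second URL on the same remote
    print(list(upstream.urls))
    fetch_results = upstream.fetch()                                 # list of FetchInfo
    push_results = upstream.push('refs/heads/master')                # list of PushInfo
    return fetch_results, push_results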

View File

@ -1,4 +0,0 @@
"""Initialize the Repo package"""
# flake8: noqa
from __future__ import absolute_import
from .base import *

View File

@ -1,346 +0,0 @@
"""Package with general repository related functions"""
import os
import stat
from string import digits
from git.exc import WorkTreeRepositoryUnsupported
from git.objects import Object
from git.refs import SymbolicReference
from git.util import hex_to_bin, bin_to_hex, decygpath
from gitdb.exc import (
BadObject,
BadName,
)
import os.path as osp
from git.cmd import Git
__all__ = ('rev_parse', 'is_git_dir', 'touch', 'find_submodule_git_dir', 'name_to_object', 'short_to_long', 'deref_tag',
'to_commit', 'find_worktree_git_dir')
def touch(filename):
with open(filename, "ab"):
pass
return filename
def is_git_dir(d):
""" This is taken from the git setup.c:is_git_directory
function.
@throws WorkTreeRepositoryUnsupported if it sees a worktree directory. It's quite hacky to do that here,
but at least clearly indicates that we don't support it.
There is an unlikely danger of throwing if we see directories which merely look like a worktree dir
but aren't one."""
if osp.isdir(d):
if (osp.isdir(osp.join(d, 'objects')) or 'GIT_OBJECT_DIRECTORY' in os.environ) \
and osp.isdir(osp.join(d, 'refs')):
headref = osp.join(d, 'HEAD')
return osp.isfile(headref) or \
(osp.islink(headref) and
os.readlink(headref).startswith('refs'))
elif (osp.isfile(osp.join(d, 'gitdir')) and
osp.isfile(osp.join(d, 'commondir')) and
osp.isfile(osp.join(d, 'gitfile'))):
raise WorkTreeRepositoryUnsupported(d)
return False
def find_worktree_git_dir(dotgit):
"""Search for a gitdir for this worktree."""
try:
statbuf = os.stat(dotgit)
except OSError:
return None
if not stat.S_ISREG(statbuf.st_mode):
return None
try:
lines = open(dotgit, 'r').readlines()
for key, value in [line.strip().split(': ') for line in lines]:
if key == 'gitdir':
return value
except ValueError:
pass
return None
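# --- Editor's note: illustrative sketch, not part of the original file. ---
# Shows the one-line '.git' file format of a linked worktree that
# find_worktree_git_dir() parses; all paths are hypothetical assumptions.
def _editor_example_worktree_git_dir(tmp_dir='/tmp/example_worktree'):
    dotgit = osp.join(tmp_dir, '.git')
    with open(dotgit, 'w') as fp:
        fp.write('gitdir: /tmp/example_repo/.git/worktrees/example_worktree\n')
    # returns '/tmp/example_repo/.git/worktrees/example_worktree'
    return find_worktree_git_dir(dotgit)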
def find_submodule_git_dir(d):
"""Search for a submodule repo."""
if is_git_dir(d):
return d
try:
with open(d) as fp:
content = fp.read().rstrip()
except (IOError, OSError):
# it's probably not a file
pass
else:
if content.startswith('gitdir: '):
path = content[8:]
if Git.is_cygwin():
## Cygwin creates submodules prefixed with `/cygdrive/...` suffixes.
path = decygpath(path)
if not osp.isabs(path):
path = osp.normpath(osp.join(osp.dirname(d), path))
return find_submodule_git_dir(path)
# end handle exception
return None
def short_to_long(odb, hexsha):
""":return: long hexadecimal sha1 from the given less-than-40 byte hexsha
or None if no candidate could be found.
:param hexsha: hexsha with less than 40 bytes"""
try:
return bin_to_hex(odb.partial_to_complete_sha_hex(hexsha))
except BadObject:
return None
# END exception handling
def name_to_object(repo, name, return_ref=False):
"""
:return: object specified by the given name, hexshas ( short and long )
as well as references are supported
:param return_ref: if name specifies a reference, we will return the reference
instead of the object. Otherwise it will raise BadObject or BadName
"""
hexsha = None
# is it a hexsha ? Try the most common ones, which are 7 to 40
if repo.re_hexsha_shortened.match(name):
if len(name) != 40:
# find long sha for short sha
hexsha = short_to_long(repo.odb, name)
else:
hexsha = name
# END handle short shas
# END find sha if it matches
# if we couldn't find an object for what seemed to be a short hexsha
# try to find it as reference anyway, it could be named 'aaa' for instance
if hexsha is None:
for base in ('%s', 'refs/%s', 'refs/tags/%s', 'refs/heads/%s', 'refs/remotes/%s', 'refs/remotes/%s/HEAD'):
try:
hexsha = SymbolicReference.dereference_recursive(repo, base % name)
if return_ref:
return SymbolicReference(repo, base % name)
# END handle symbolic ref
break
except ValueError:
pass
# END for each base
# END handle hexsha
# didn't find any ref, this is an error
if return_ref:
raise BadObject("Couldn't find reference named %r" % name)
# END handle return ref
# tried everything ? fail
if hexsha is None:
raise BadName(name)
# END assert hexsha was found
return Object.new_from_sha(repo, hex_to_bin(hexsha))
def deref_tag(tag):
"""Recursively dereference a tag and return the resulting object"""
while True:
try:
tag = tag.object
except AttributeError:
break
# END dereference tag
return tag
def to_commit(obj):
"""Convert the given object to a commit if possible and return it"""
if obj.type == 'tag':
obj = deref_tag(obj)
if obj.type != "commit":
raise ValueError("Cannot convert object %r to type commit" % obj)
# END verify type
return obj
def rev_parse(repo, rev):
"""
:return: Object at the given revision, either Commit, Tag, Tree or Blob
:param rev: git-rev-parse compatible revision specification as string, please see
http://www.kernel.org/pub/software/scm/git/docs/git-rev-parse.html
for details
:raise BadObject: if the given revision could not be found
:raise ValueError: If rev couldn't be parsed
:raise IndexError: If invalid reflog index is specified"""
# colon search mode ?
if rev.startswith(':/'):
# colon search mode
raise NotImplementedError("commit by message search ( regex )")
# END handle search
obj = None
ref = None
output_type = "commit"
start = 0
parsed_to = 0
lr = len(rev)
while start < lr:
if rev[start] not in "^~:@":
start += 1
continue
# END handle start
token = rev[start]
if obj is None:
# token is a rev name
if start == 0:
ref = repo.head.ref
else:
if token == '@':
ref = name_to_object(repo, rev[:start], return_ref=True)
else:
obj = name_to_object(repo, rev[:start])
# END handle token
# END handle refname
if ref is not None:
obj = ref.commit
# END handle ref
# END initialize obj on first token
start += 1
# try to parse {type}
if start < lr and rev[start] == '{':
end = rev.find('}', start)
if end == -1:
raise ValueError("Missing closing brace to define type in %s" % rev)
output_type = rev[start + 1:end] # exclude brace
# handle type
if output_type == 'commit':
pass # default
elif output_type == 'tree':
try:
obj = to_commit(obj).tree
except (AttributeError, ValueError):
pass # error raised later
# END exception handling
elif output_type in ('', 'blob'):
if obj.type == 'tag':
obj = deref_tag(obj)
else:
# cannot do anything for non-tags
pass
# END handle tag
elif token == '@':
# try single int
assert ref is not None, "Requre Reference to access reflog"
revlog_index = None
try:
# transform reversed index into the format of our revlog
revlog_index = -(int(output_type) + 1)
except ValueError as e:
# TODO: Try to parse the other date options, using parse_date
# maybe
raise NotImplementedError("Support for additional @{...} modes not implemented") from e
# END handle revlog index
try:
entry = ref.log_entry(revlog_index)
except IndexError as e:
raise IndexError("Invalid revlog index: %i" % revlog_index) from e
# END handle index out of bound
obj = Object.new_from_sha(repo, hex_to_bin(entry.newhexsha))
# make it pass the following checks
output_type = None
else:
raise ValueError("Invalid output type: %s ( in %s )" % (output_type, rev))
# END handle output type
# empty output types don't require any specific type, it's just about dereferencing tags
if output_type and obj.type != output_type:
raise ValueError("Could not accommodate requested object type %r, got %s" % (output_type, obj.type))
# END verify output type
start = end + 1 # skip brace
parsed_to = start
continue
# END parse type
# try to parse a number
num = 0
if token != ":":
found_digit = False
while start < lr:
if rev[start] in digits:
num = num * 10 + int(rev[start])
start += 1
found_digit = True
else:
break
# END handle number
# END number parse loop
# no explicit number given, 1 is the default
# It could be 0 though
if not found_digit:
num = 1
# END set default num
# END number parsing only if non-blob mode
parsed_to = start
# handle hierarchy walk
try:
if token == "~":
obj = to_commit(obj)
for _ in range(num):
obj = obj.parents[0]
# END for each history item to walk
elif token == "^":
obj = to_commit(obj)
# must be n'th parent
if num:
obj = obj.parents[num - 1]
elif token == ":":
if obj.type != "tree":
obj = obj.tree
# END get tree type
obj = obj[rev[start:]]
parsed_to = lr
else:
raise ValueError("Invalid token: %r" % token)
# END end handle tag
except (IndexError, AttributeError) as e:
raise BadName(
"Invalid revision spec '%s' - not enough "
"parent commits to reach '%s%i'" % (rev, token, num)) from e
# END exception handling
# END parse loop
# still no obj? It's probably a simple name
if obj is None:
obj = name_to_object(repo, rev)
parsed_to = lr
# END handle simple name
if obj is None:
raise ValueError("Revision specifier could not be parsed: %s" % rev)
if parsed_to != lr:
raise ValueError("Didn't consume complete rev spec %s, consumed part: %s" % (rev, rev[:parsed_to]))
return obj
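# --- Editor's note: illustrative sketch, not part of the original file. ---
# A few revision specifiers the parser above understands; the repository path,
# the tag name and the file name are hypothetical assumptions.
def _editor_example_rev_parse():
    import git
    repo = git.Repo('/tmp/example_repo')           # hypothetical repository
    commit = rev_parse(repo, 'HEAD~2')             # second ancestor of HEAD
    tree = rev_parse(repo, 'v1.0^{tree}')          # peel a (hypothetical) tag to its tree
    blob = rev_parse(repo, 'HEAD:README.md')       # path lookup within HEAD's tree
    return commit, tree, blob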

View File

@ -1,944 +0,0 @@
# utils.py
# Copyright (C) 2008, 2009 Michael Trier (mtrier@gmail.com) and contributors
#
# This module is part of GitPython and is released under
# the BSD License: http://www.opensource.org/licenses/bsd-license.php
import contextlib
from functools import wraps
import getpass
import logging
import os
import platform
import subprocess
import re
import shutil
import stat
from sys import maxsize
import time
from unittest import SkipTest
from gitdb.util import (# NOQA @IgnorePep8
make_sha,
LockedFD, # @UnusedImport
file_contents_ro, # @UnusedImport
file_contents_ro_filepath, # @UnusedImport
LazyMixin, # @UnusedImport
to_hex_sha, # @UnusedImport
to_bin_sha, # @UnusedImport
bin_to_hex, # @UnusedImport
hex_to_bin, # @UnusedImport
)
from git.compat import is_win
import os.path as osp
from .exc import InvalidGitRepositoryError
# NOTE: Some of the unused imports might be used/imported by others.
# Handle once test-cases are back up and running.
# Most of these are unused here, but are for use by git-python modules so these
# don't see gitdb all the time. Flake of course doesn't like it.
__all__ = ["stream_copy", "join_path", "to_native_path_linux",
"join_path_native", "Stats", "IndexFileSHA1Writer", "Iterable", "IterableList",
"BlockingLockFile", "LockFile", 'Actor', 'get_user_id', 'assure_directory_exists',
'RemoteProgress', 'CallableRemoteProgress', 'rmtree', 'unbare_repo',
'HIDE_WINDOWS_KNOWN_ERRORS']
log = logging.getLogger(__name__)
#: We need an easy way to see if Appveyor TCs start failing,
#: so the errors marked with this var are considered "acknowledged" ones, awaiting remedy,
#: till then, we wish to hide them.
HIDE_WINDOWS_KNOWN_ERRORS = is_win and os.environ.get('HIDE_WINDOWS_KNOWN_ERRORS', True)
HIDE_WINDOWS_FREEZE_ERRORS = is_win and os.environ.get('HIDE_WINDOWS_FREEZE_ERRORS', True)
#{ Utility Methods
def unbare_repo(func):
"""Methods with this decorator raise InvalidGitRepositoryError if they
encounter a bare repository"""
@wraps(func)
def wrapper(self, *args, **kwargs):
if self.repo.bare:
raise InvalidGitRepositoryError("Method '%s' cannot operate on bare repositories" % func.__name__)
# END bare method
return func(self, *args, **kwargs)
# END wrapper
return wrapper
@contextlib.contextmanager
def cwd(new_dir):
old_dir = os.getcwd()
os.chdir(new_dir)
try:
yield new_dir
finally:
os.chdir(old_dir)
def rmtree(path):
"""Remove the given recursively.
:note: we use shutil rmtree but adjust its behaviour to see whether files that
couldn't be deleted are read-only. Windows will not remove them in that case"""
def onerror(func, path, exc_info):
# Is the error an access error ?
os.chmod(path, stat.S_IWUSR)
try:
func(path) # Will scream if still not possible to delete.
except Exception as ex:
if HIDE_WINDOWS_KNOWN_ERRORS:
raise SkipTest("FIXME: fails with: PermissionError\n {}".format(ex)) from ex
raise
return shutil.rmtree(path, False, onerror)
def rmfile(path):
"""Ensure file deleted also on *Windows* where read-only files need special treatment."""
if osp.isfile(path):
if is_win:
os.chmod(path, 0o777)
os.remove(path)
def stream_copy(source, destination, chunk_size=512 * 1024):
"""Copy all data from the source stream into the destination stream in chunks
of size chunk_size
:return: amount of bytes written"""
br = 0
while True:
chunk = source.read(chunk_size)
destination.write(chunk)
br += len(chunk)
if len(chunk) < chunk_size:
break
# END reading output stream
return br
def join_path(a, *p):
"""Join path tokens together similar to osp.join, but always use
'/' instead of possibly '\' on windows."""
path = a
for b in p:
if not b:
continue
if b.startswith('/'):
path += b[1:]
elif path == '' or path.endswith('/'):
path += b
else:
path += '/' + b
# END for each path token to add
return path
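# --- Editor's note: illustrative sketch, not part of the original file. ---
# join_path always builds '/'-separated paths regardless of platform, while
# join_path_native (defined further below) converts to the OS-native separator.
assert join_path('refs', 'remotes', 'origin') == 'refs/remotes/origin'
assert join_path('refs/heads/', 'master') == 'refs/heads/master'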
if is_win:
def to_native_path_windows(path):
return path.replace('/', '\\')
def to_native_path_linux(path):
return path.replace('\\', '/')
__all__.append("to_native_path_windows")
to_native_path = to_native_path_windows
else:
# no need for any work on linux
def to_native_path_linux(path):
return path
to_native_path = to_native_path_linux
def join_path_native(a, *p):
"""
As join path, but makes sure an OS native path is returned. This is only
needed to play it safe on my dear windows and to assure nice paths that only
use '\'"""
return to_native_path(join_path(a, *p))
def assure_directory_exists(path, is_file=False):
"""Assure that the directory pointed to by path exists.
:param is_file: If True, path is assumed to be a file and handled correctly.
Otherwise it must be a directory
:return: True if the directory was created, False if it already existed"""
if is_file:
path = osp.dirname(path)
# END handle file
if not osp.isdir(path):
os.makedirs(path, exist_ok=True)
return True
return False
def _get_exe_extensions():
PATHEXT = os.environ.get('PATHEXT', None)
return tuple(p.upper() for p in PATHEXT.split(os.pathsep)) \
if PATHEXT \
else (('.BAT', '.COM', '.EXE') if is_win else ())
def py_where(program, path=None):
# From: http://stackoverflow.com/a/377028/548792
winprog_exts = _get_exe_extensions()
def is_exec(fpath):
return osp.isfile(fpath) and os.access(fpath, os.X_OK) and (
os.name != 'nt' or not winprog_exts or any(fpath.upper().endswith(ext)
for ext in winprog_exts))
progs = []
if not path:
path = os.environ["PATH"]
for folder in path.split(os.pathsep):
folder = folder.strip('"')
if folder:
exe_path = osp.join(folder, program)
for f in [exe_path] + ['%s%s' % (exe_path, e) for e in winprog_exts]:
if is_exec(f):
progs.append(f)
return progs
def _cygexpath(drive, path):
if osp.isabs(path) and not drive:
## Invoked from `cygpath()` directly with `D:Apps\123`?
# It's an error; leave it as-is (only the slashes get converted below)
p = path
else:
p = path and osp.normpath(osp.expandvars(osp.expanduser(path)))
if osp.isabs(p):
if drive:
# Confusing, maybe a remote system should expand vars.
p = path
else:
p = cygpath(p)
elif drive:
p = '/cygdrive/%s/%s' % (drive.lower(), p)
return p.replace('\\', '/')
_cygpath_parsers = (
## See: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
## and: https://www.cygwin.com/cygwin-ug-net/using.html#unc-paths
(re.compile(r"\\\\\?\\UNC\\([^\\]+)\\([^\\]+)(?:\\(.*))?"),
(lambda server, share, rest_path: '//%s/%s/%s' % (server, share, rest_path.replace('\\', '/'))),
False
),
(re.compile(r"\\\\\?\\(\w):[/\\](.*)"),
_cygexpath,
False
),
(re.compile(r"(\w):[/\\](.*)"),
_cygexpath,
False
),
(re.compile(r"file:(.*)", re.I),
(lambda rest_path: rest_path),
True),
(re.compile(r"(\w{2,}:.*)"), # remote URL, do nothing
(lambda url: url),
False),
)
def cygpath(path):
"""Use :meth:`git.cmd.Git.polish_url()` instead, that works on any environment."""
if not path.startswith(('/cygdrive', '//')):
for regex, parser, recurse in _cygpath_parsers:
match = regex.match(path)
if match:
path = parser(*match.groups())
if recurse:
path = cygpath(path)
break
else:
path = _cygexpath(None, path)
return path
_decygpath_regex = re.compile(r"/cygdrive/(\w)(/.*)?")
def decygpath(path):
m = _decygpath_regex.match(path)
if m:
drive, rest_path = m.groups()
path = '%s:%s' % (drive.upper(), rest_path or '')
return path.replace('/', '\\')
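# --- Editor's note: illustrative sketch, not part of the original file. ---
# Round-trip between Windows-style and Cygwin-style paths; the concrete drive
# and directories are hypothetical, and the conversion only matters on Cygwin setups.
assert cygpath('C:\\tmp\\repo') == '/cygdrive/c/tmp/repo'
assert decygpath('/cygdrive/c/tmp/repo') == 'C:\\tmp\\repo'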
#: Store boolean flags denoting if a specific Git executable
#: is from a Cygwin installation (since `cache_lru()` unsupported on PY2).
_is_cygwin_cache = {}
def is_cygwin_git(git_executable):
if not is_win:
return False
#from subprocess import check_output
is_cygwin = _is_cygwin_cache.get(git_executable)
if is_cygwin is None:
is_cygwin = False
try:
git_dir = osp.dirname(git_executable)
if not git_dir:
res = py_where(git_executable)
git_dir = osp.dirname(res[0]) if res else None
## Just a name given, not a real path.
uname_cmd = osp.join(git_dir, 'uname')
process = subprocess.Popen([uname_cmd], stdout=subprocess.PIPE,
universal_newlines=True)
uname_out, _ = process.communicate()
#retcode = process.poll()
is_cygwin = 'CYGWIN' in uname_out
except Exception as ex:
log.debug('Failed checking if running in CYGWIN due to: %r', ex)
_is_cygwin_cache[git_executable] = is_cygwin
return is_cygwin
def get_user_id():
""":return: string identifying the currently active system user as name@node"""
return "%s@%s" % (getpass.getuser(), platform.node())
def finalize_process(proc, **kwargs):
"""Wait for the process (clone, fetch, pull or push) and handle its errors accordingly"""
## TODO: No close proc-streams??
proc.wait(**kwargs)
def expand_path(p, expand_vars=True):
try:
p = osp.expanduser(p)
if expand_vars:
p = osp.expandvars(p)
return osp.normpath(osp.abspath(p))
except Exception:
return None
#} END utilities
#{ Classes
class RemoteProgress(object):
"""
Handler providing an interface to parse progress information emitted by git-push
and git-fetch and to dispatch callbacks allowing subclasses to react to the progress.
"""
_num_op_codes = 9
BEGIN, END, COUNTING, COMPRESSING, WRITING, RECEIVING, RESOLVING, FINDING_SOURCES, CHECKING_OUT = \
[1 << x for x in range(_num_op_codes)]
STAGE_MASK = BEGIN | END
OP_MASK = ~STAGE_MASK
DONE_TOKEN = 'done.'
TOKEN_SEPARATOR = ', '
__slots__ = ('_cur_line',
'_seen_ops',
'error_lines', # Lines that started with 'error:' or 'fatal:'.
'other_lines') # Lines not denoting progress (e.g. push-infos).
re_op_absolute = re.compile(r"(remote: )?([\w\s]+):\s+()(\d+)()(.*)")
re_op_relative = re.compile(r"(remote: )?([\w\s]+):\s+(\d+)% \((\d+)/(\d+)\)(.*)")
def __init__(self):
self._seen_ops = []
self._cur_line = None
self.error_lines = []
self.other_lines = []
def _parse_progress_line(self, line):
"""Parse progress information from the given line as retrieved by git-push
or git-fetch.
- Lines that do not contain progress info are stored in :attr:`other_lines`.
- Lines that seem to contain an error (i.e. start with error: or fatal:) are stored
in :attr:`error_lines`."""
# handle
# Counting objects: 4, done.
# Compressing objects: 50% (1/2)
# Compressing objects: 100% (2/2)
# Compressing objects: 100% (2/2), done.
self._cur_line = line = line.decode('utf-8') if isinstance(line, bytes) else line
if self.error_lines or self._cur_line.startswith(('error:', 'fatal:')):
self.error_lines.append(self._cur_line)
return
# find escape characters and cut them away - regex will not work with
# them as they are non-ascii. As git might expect a tty, it will send them
last_valid_index = None
for i, c in enumerate(reversed(line)):
if ord(c) < 32:
# it's a slice index
last_valid_index = -i - 1
# END character was non-ascii
# END for each character in line
if last_valid_index is not None:
line = line[:last_valid_index]
# END cut away invalid part
line = line.rstrip()
cur_count, max_count = None, None
match = self.re_op_relative.match(line)
if match is None:
match = self.re_op_absolute.match(line)
if not match:
self.line_dropped(line)
self.other_lines.append(line)
return
# END could not get match
op_code = 0
_remote, op_name, _percent, cur_count, max_count, message = match.groups()
# get operation id
if op_name == "Counting objects":
op_code |= self.COUNTING
elif op_name == "Compressing objects":
op_code |= self.COMPRESSING
elif op_name == "Writing objects":
op_code |= self.WRITING
elif op_name == 'Receiving objects':
op_code |= self.RECEIVING
elif op_name == 'Resolving deltas':
op_code |= self.RESOLVING
elif op_name == 'Finding sources':
op_code |= self.FINDING_SOURCES
elif op_name == 'Checking out files':
op_code |= self.CHECKING_OUT
else:
# Note: On windows it can happen that partial lines are sent
# Hence we get something like "CompreReceiving objects", which is
# a blend of "Compressing objects" and "Receiving objects".
# This can't really be prevented, so we drop the line verbosely
# to make sure we get informed in case the process spits out new
# commands at some point.
self.line_dropped(line)
# Note: Don't add this line to the other lines, as we have to silently
# drop it
return
# END handle op code
# figure out stage
if op_code not in self._seen_ops:
self._seen_ops.append(op_code)
op_code |= self.BEGIN
# END begin opcode
if message is None:
message = ''
# END message handling
message = message.strip()
if message.endswith(self.DONE_TOKEN):
op_code |= self.END
message = message[:-len(self.DONE_TOKEN)]
# END end message handling
message = message.strip(self.TOKEN_SEPARATOR)
self.update(op_code,
cur_count and float(cur_count),
max_count and float(max_count),
message)
def new_message_handler(self):
"""
:return:
a progress handler suitable for handle_process_output(), passing lines on to this Progress
handler in a suitable format"""
def handler(line):
return self._parse_progress_line(line.rstrip())
# end
return handler
def line_dropped(self, line):
"""Called whenever a line could not be understood and was therefore dropped."""
pass
def update(self, op_code, cur_count, max_count=None, message=''):
"""Called whenever the progress changes
:param op_code:
Integer allowing to be compared against Operation IDs and stage IDs.
Stage IDs are BEGIN and END. BEGIN will only be set once for each Operation
ID as well as END. It may be that BEGIN and END are set at once in case only
one progress message was emitted due to the speed of the operation.
Between BEGIN and END, none of these flags will be set
Operation IDs are all held within the OP_MASK. Only one Operation ID will
be active per call.
:param cur_count: Current absolute count of items
:param max_count:
The maximum count of items we expect. It may be None in case there is
no maximum number of items or if it is (yet) unknown.
:param message:
In case of the 'WRITING' operation, it contains the amount of bytes
transferred. It may possibly be used for other purposes as well.
You may read the contents of the current line in self._cur_line"""
pass
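# --- Editor's note: illustrative sketch, not part of the original file. ---
# A minimal progress subclass as described in update() above; the repository
# path and the 'origin' remote are hypothetical assumptions.
class _EditorExampleProgress(RemoteProgress):
    def update(self, op_code, cur_count, max_count=None, message=''):
        print(self._cur_line)        # raw line, as documented in update()

def _editor_example_progress():
    import git
    repo = git.Repo('/tmp/example_repo')                  # hypothetical repository
    repo.remote('origin').fetch(progress=_EditorExampleProgress())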
class CallableRemoteProgress(RemoteProgress):
"""An implementation forwarding updates to any callable"""
__slots__ = ('_callable')
def __init__(self, fn):
self._callable = fn
super(CallableRemoteProgress, self).__init__()
def update(self, *args, **kwargs):
self._callable(*args, **kwargs)
class Actor(object):
"""Actors hold information about a person acting on the repository. They
can be committers and authors or anything with a name and an email as
mentioned in the git log entries."""
# PRECOMPILED REGEX
name_only_regex = re.compile(r'<(.*)>')
name_email_regex = re.compile(r'(.*) <(.*?)>')
# ENVIRONMENT VARIABLES
# read when creating new commits
env_author_name = "GIT_AUTHOR_NAME"
env_author_email = "GIT_AUTHOR_EMAIL"
env_committer_name = "GIT_COMMITTER_NAME"
env_committer_email = "GIT_COMMITTER_EMAIL"
# CONFIGURATION KEYS
conf_name = 'name'
conf_email = 'email'
__slots__ = ('name', 'email')
def __init__(self, name, email):
self.name = name
self.email = email
def __eq__(self, other):
return self.name == other.name and self.email == other.email
def __ne__(self, other):
return not (self == other)
def __hash__(self):
return hash((self.name, self.email))
def __str__(self):
return self.name
def __repr__(self):
return '<git.Actor "%s <%s>">' % (self.name, self.email)
@classmethod
def _from_string(cls, string):
"""Create an Actor from a string.
:param string: is the string, which is expected to be in regular git format
John Doe <jdoe@example.com>
:return: Actor """
m = cls.name_email_regex.search(string)
if m:
name, email = m.groups()
return Actor(name, email)
else:
m = cls.name_only_regex.search(string)
if m:
return Actor(m.group(1), None)
# assume best and use the whole string as name
return Actor(string, None)
# END special case name
# END handle name/email matching
@classmethod
def _main_actor(cls, env_name, env_email, config_reader=None):
actor = Actor('', '')
user_id = None # We use this to avoid multiple calls to getpass.getuser()
def default_email():
nonlocal user_id
if not user_id:
user_id = get_user_id()
return user_id
def default_name():
return default_email().split('@')[0]
for attr, evar, cvar, default in (('name', env_name, cls.conf_name, default_name),
('email', env_email, cls.conf_email, default_email)):
try:
val = os.environ[evar]
setattr(actor, attr, val)
except KeyError:
if config_reader is not None:
setattr(actor, attr, config_reader.get_value('user', cvar, default()))
# END config-reader handling
if not getattr(actor, attr):
setattr(actor, attr, default())
# END handle name
# END for each item to retrieve
return actor
@classmethod
def committer(cls, config_reader=None):
"""
:return: Actor instance corresponding to the configured committer. It behaves
similar to the git implementation, such that the environment will override
configuration values of config_reader. If no value is set at all, it will be
generated
:param config_reader: ConfigReader to use to retrieve the values from in case
they are not set in the environment"""
return cls._main_actor(cls.env_committer_name, cls.env_committer_email, config_reader)
@classmethod
def author(cls, config_reader=None):
"""Same as committer(), but defines the main author. It may be specified in the environment,
but defaults to the committer"""
return cls._main_actor(cls.env_author_name, cls.env_author_email, config_reader)
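# --- Editor's note: illustrative sketch, not part of the original file. ---
# Parsing the regular git author format and reading the configured committer;
# the example name and email are placeholders.
def _editor_example_actor():
    a = Actor._from_string('John Doe <jdoe@example.com>')
    assert (a.name, a.email) == ('John Doe', 'jdoe@example.com')
    # committer() consults the GIT_COMMITTER_* environment variables first,
    # then the passed config_reader, and finally falls back to user@host.
    return Actor.committer(config_reader=None)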
class Stats(object):
"""
Represents stat information as presented by git at the end of a merge. It is
created from the output of a diff operation.
``Example``::
c = Commit( sha1 )
s = c.stats
s.total # full-stat-dict
s.files # dict( filepath : stat-dict )
``stat-dict``
A dictionary with the following keys and values::
deletions = number of deleted lines as int
insertions = number of inserted lines as int
lines = total number of lines changed as int, or deletions + insertions
``full-stat-dict``
In addition to the items in the stat-dict, it features additional information::
files = number of changed files as int"""
__slots__ = ("total", "files")
def __init__(self, total, files):
self.total = total
self.files = files
@classmethod
def _list_from_string(cls, repo, text):
"""Create a Stat object from output retrieved by git-diff.
:return: git.Stat"""
hsh = {'total': {'insertions': 0, 'deletions': 0, 'lines': 0, 'files': 0}, 'files': {}}
for line in text.splitlines():
(raw_insertions, raw_deletions, filename) = line.split("\t")
insertions = raw_insertions != '-' and int(raw_insertions) or 0
deletions = raw_deletions != '-' and int(raw_deletions) or 0
hsh['total']['insertions'] += insertions
hsh['total']['deletions'] += deletions
hsh['total']['lines'] += insertions + deletions
hsh['total']['files'] += 1
hsh['files'][filename.strip()] = {'insertions': insertions,
'deletions': deletions,
'lines': insertions + deletions}
return Stats(hsh['total'], hsh['files'])
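# --- Editor's note: illustrative sketch, not part of the original file. ---
# The text parsed above is `git diff --numstat` output: "<insertions>\t<deletions>\t<path>".
# The repo argument is unused by the parser, so None is passed here; file names are placeholders.
def _editor_example_stats():
    numstat = "10\t2\tREADME.md\n0\t3\tsetup.py"
    s = Stats._list_from_string(None, numstat)
    assert s.total['files'] == 2 and s.total['lines'] == 15
    return s.files['README.md']      # {'insertions': 10, 'deletions': 2, 'lines': 12}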
class IndexFileSHA1Writer(object):
"""Wrapper around a file-like object that remembers the SHA1 of
the data written to it. It will write a sha when the stream is closed
or if asked for explicitly using write_sha.
Only useful to the indexfile
:note: Based on the dulwich project"""
__slots__ = ("f", "sha1")
def __init__(self, f):
self.f = f
self.sha1 = make_sha(b"")
def write(self, data):
self.sha1.update(data)
return self.f.write(data)
def write_sha(self):
sha = self.sha1.digest()
self.f.write(sha)
return sha
def close(self):
sha = self.write_sha()
self.f.close()
return sha
def tell(self):
return self.f.tell()
class LockFile(object):
"""Provides methods to obtain, check for, and release a file based lock which
should be used to handle concurrent access to the same file.
As we are a utility class to be derived from, we only use protected methods.
Locks will automatically be released on destruction"""
__slots__ = ("_file_path", "_owns_lock")
def __init__(self, file_path):
self._file_path = file_path
self._owns_lock = False
def __del__(self):
self._release_lock()
def _lock_file_path(self):
""":return: Path to lockfile"""
return "%s.lock" % (self._file_path)
def _has_lock(self):
""":return: True if we have a lock and if the lockfile still exists
:raise AssertionError: if our lock-file does not exist"""
return self._owns_lock
def _obtain_lock_or_raise(self):
"""Create a lock file as flag for other instances, mark our instance as lock-holder
:raise IOError: if a lock was already present or a lock file could not be written"""
if self._has_lock():
return
lock_file = self._lock_file_path()
if osp.isfile(lock_file):
raise IOError("Lock for file %r did already exist, delete %r in case the lock is illegal" %
(self._file_path, lock_file))
try:
flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
if is_win:
flags |= os.O_SHORT_LIVED
fd = os.open(lock_file, flags, 0)
os.close(fd)
except OSError as e:
raise IOError(str(e)) from e
self._owns_lock = True
def _obtain_lock(self):
"""The default implementation will raise if a lock cannot be obtained.
Subclasses may override this method to provide a different implementation"""
return self._obtain_lock_or_raise()
def _release_lock(self):
"""Release our lock if we have one"""
if not self._has_lock():
return
# if someone removed our file beforehand, let's just flag this issue
# instead of failing, to make it more usable.
lfp = self._lock_file_path()
try:
rmfile(lfp)
except OSError:
pass
self._owns_lock = False
class BlockingLockFile(LockFile):
"""The lock file will block until a lock could be obtained, or fail after
a specified timeout.
:note: If the directory containing the lock was removed, an exception will
be raised during the blocking period, preventing hangs as the lock
can never be obtained."""
__slots__ = ("_check_interval", "_max_block_time")
def __init__(self, file_path, check_interval_s=0.3, max_block_time_s=maxsize):
"""Configure the instance
:param check_interval_s:
Period of time to sleep until the lock is checked the next time.
:param max_block_time_s: Maximum amount of seconds we may block waiting for the lock;
by default this is nearly unlimited."""
super(BlockingLockFile, self).__init__(file_path)
self._check_interval = check_interval_s
self._max_block_time = max_block_time_s
def _obtain_lock(self):
"""This method blocks until it obtained the lock, or raises IOError if
it ran out of time or if the parent directory was not available anymore.
If this method returns, you are guaranteed to own the lock"""
starttime = time.time()
maxtime = starttime + float(self._max_block_time)
while True:
try:
super(BlockingLockFile, self)._obtain_lock()
except IOError as e:
# sanity check: if the directory leading to the lockfile is not
# readable anymore, raise an exception
curtime = time.time()
if not osp.isdir(osp.dirname(self._lock_file_path())):
msg = "Directory containing the lockfile %r was not readable anymore after waiting %g seconds" % (
self._lock_file_path(), curtime - starttime)
raise IOError(msg) from e
# END handle missing directory
if curtime >= maxtime:
msg = "Waited %g seconds for lock at %r" % (maxtime - starttime, self._lock_file_path())
raise IOError(msg) from e
# END abort if we wait too long
time.sleep(self._check_interval)
else:
break
# END endless loop
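# --- Editor's note: illustrative sketch, not part of the original file. ---
# The lock classes only expose protected methods since they are meant to be
# derived from; this sketch drives them directly on a hypothetical path.
def _editor_example_locking(path='/tmp/example.cfg'):
    lock = BlockingLockFile(path, check_interval_s=0.1, max_block_time_s=5)
    lock._obtain_lock()           # blocks up to 5 seconds, then raises IOError
    try:
        pass                      # ... modify the protected file here ...
    finally:
        lock._release_lock()      # removes '/tmp/example.cfg.lock'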
class IterableList(list):
"""
List of iterable objects allowing to query an object by id or by named index::
heads = repo.heads
heads.master
heads['master']
heads[0]
It requires an id_attribute name to be set which will be queried from its
contained items to have a means for comparison.
A prefix can be specified which is to be used in case the id returned by the
items always contains a prefix that does not matter to the user, so it
can be left out."""
__slots__ = ('_id_attr', '_prefix')
def __new__(cls, id_attr, prefix=''):
return super(IterableList, cls).__new__(cls)
def __init__(self, id_attr, prefix=''):
self._id_attr = id_attr
self._prefix = prefix
def __contains__(self, attr):
# first try identity match for performance
try:
rval = list.__contains__(self, attr)
if rval:
return rval
except (AttributeError, TypeError):
pass
# END handle match
# otherwise make a full name search
try:
getattr(self, attr)
return True
except (AttributeError, TypeError):
return False
# END handle membership
def __getattr__(self, attr):
attr = self._prefix + attr
for item in self:
if getattr(item, self._id_attr) == attr:
return item
# END for each item
return list.__getattribute__(self, attr)
def __getitem__(self, index):
if isinstance(index, int):
return list.__getitem__(self, index)
try:
return getattr(self, index)
except AttributeError as e:
raise IndexError("No item found with id %r" % (self._prefix + index)) from e
# END handle getattr
def __delitem__(self, index):
delindex = index
if not isinstance(index, int):
delindex = -1
name = self._prefix + index
for i, item in enumerate(self):
if getattr(item, self._id_attr) == name:
delindex = i
break
# END search index
# END for each item
if delindex == -1:
raise IndexError("Item with name %s not found" % name)
# END handle error
# END get index to delete
list.__delitem__(self, delindex)
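# --- Editor's note: illustrative sketch, not part of the original file. ---
# Items only need the configured id attribute ('name' here); the prefix lets
# callers drop the 'origin/' part when looking items up. Names are placeholders.
def _editor_example_iterable_list():
    from types import SimpleNamespace
    refs = IterableList('name', prefix='origin/')
    refs.extend([SimpleNamespace(name='origin/master'),
                 SimpleNamespace(name='origin/develop')])
    assert refs.master is refs['master'] is refs[0]
    assert 'develop' in refs
    return refs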
class Iterable(object):
"""Defines an interface for iterable items which is to assure a uniform
way to retrieve and iterate items within the git repository"""
__slots__ = ()
_id_attribute_ = "attribute that most suitably identifies your instance"
@classmethod
def list_items(cls, repo, *args, **kwargs):
"""
Find all items of this type - subclasses can specify args and kwargs differently.
Subclasses are obliged to return all items if no additional arguments are given.
:note: Favor the iter_items method as it avoids eagerly building a list of all items.
:return: list(Item,...) list of item instances"""
out_list = IterableList(cls._id_attribute_)
out_list.extend(cls.iter_items(repo, *args, **kwargs))
return out_list
@classmethod
def iter_items(cls, repo, *args, **kwargs):
"""For more information about the arguments, see list_items
:return: iterator yielding Items"""
raise NotImplementedError("To be implemented by Subclass")
#} END classes
class NullHandler(logging.Handler):
def emit(self, record):
pass

View File

@ -1,42 +0,0 @@
Copyright (C) 2010, 2011 Sebastian Thiel and contributors
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the GitDB project nor the names of
its contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Additional Licenses
-------------------
The files at
gitdb/test/fixtures/packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.idx
and
gitdb/test/fixtures/packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.pack
are licensed under GNU GPL as part of the git source repository,
see http://en.wikipedia.org/wiki/Git_%28software%29 for more information.
They are not required for the actual operation, which is why they are not found
in the distribution package.

View File

@ -1,29 +0,0 @@
Metadata-Version: 2.1
Name: gitdb
Version: 4.0.5
Summary: Git Object Database
Home-page: https://github.com/gitpython-developers/gitdb
Author: Sebastian Thiel
Author-email: byronimo@gmail.com
License: BSD License
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Operating System :: POSIX
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.4
Requires-Dist: smmap (<4,>=3.0.1)
GitDB is a pure-Python git object database

View File

@ -1,57 +0,0 @@
gitdb-4.0.5.dist-info/AUTHORS,sha256=46ijraXmmXJ0JDzxWZBfYnWzTvm14KSjalf6Npemc_g,25
gitdb-4.0.5.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
gitdb-4.0.5.dist-info/LICENSE,sha256=79KfWWoI6IV-aOdpSlC82nKDl5LafD8EG8v_XxgAkjk,1984
gitdb-4.0.5.dist-info/METADATA,sha256=0F2rrFZjVUItol1z7FZPHTESr27x2GCOaSzPVe6bZlA,998
gitdb-4.0.5.dist-info/RECORD,,
gitdb-4.0.5.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92
gitdb-4.0.5.dist-info/top_level.txt,sha256=ss6atT8cG4mQuAYXO6PokJ0r4Mm5cBiDbKsu2e3YHfs,6
gitdb/__init__.py,sha256=wzX__1hxa3qNF-76aX7PVwaVjClCMojwCaK_QvXrr-g,1127
gitdb/__pycache__/__init__.cpython-39.pyc,,
gitdb/__pycache__/base.cpython-39.pyc,,
gitdb/__pycache__/const.cpython-39.pyc,,
gitdb/__pycache__/exc.cpython-39.pyc,,
gitdb/__pycache__/fun.cpython-39.pyc,,
gitdb/__pycache__/pack.cpython-39.pyc,,
gitdb/__pycache__/stream.cpython-39.pyc,,
gitdb/__pycache__/typ.cpython-39.pyc,,
gitdb/__pycache__/util.cpython-39.pyc,,
gitdb/base.py,sha256=UQEnspMcsv1k47FEcPOyAtrrodtrdMq-_qoQEY00dt0,8029
gitdb/const.py,sha256=WWmEYKNDdm3J9fxYTFT_B6-QLDSMBClbz0LSBa1D1S8,90
gitdb/db/__init__.py,sha256=E9KSdtGIQzq8KHKRl2X5eOlFWglfYz7v76M5z3h1GnA,377
gitdb/db/__pycache__/__init__.cpython-39.pyc,,
gitdb/db/__pycache__/base.cpython-39.pyc,,
gitdb/db/__pycache__/git.cpython-39.pyc,,
gitdb/db/__pycache__/loose.cpython-39.pyc,,
gitdb/db/__pycache__/mem.cpython-39.pyc,,
gitdb/db/__pycache__/pack.cpython-39.pyc,,
gitdb/db/__pycache__/ref.cpython-39.pyc,,
gitdb/db/base.py,sha256=X1DVnKTVsArv89LiJJ3FVNHL3tOSE5cqVW6GBHGxA7k,8984
gitdb/db/git.py,sha256=VymmdhyOexv1voh4nUGx1hvF_7SXnGAA_c2tNghy8hg,2694
gitdb/db/loose.py,sha256=2l0lB3HAzbVU3f4Wz8QoGoQ2BDWCosAZ6eMLVV3-Lyk,8094
gitdb/db/mem.py,sha256=g8g6SJ_pHIL28Mf8PKxftdptmLa_zwW3ByRaIhnCaRY,3440
gitdb/db/pack.py,sha256=wDsxUlWEUf5TX_reyb4tOxGyjdEmAUPi0ROT3ExKfi8,7305
gitdb/db/ref.py,sha256=bxw0aFlsRlxb9-iBXWs4g0SSDWOl5qhNKNzWE8b8CSY,2659
gitdb/exc.py,sha256=QmO9u8CMzBNJdB_oL_8Zkjjk__2DIFcMzeeSjn5_zzE,1307
gitdb/fun.py,sha256=WDghBEIBYNDXkgxWZYFN3zVRg9u0SBVUqucJq3Y6CY0,23254
gitdb/pack.py,sha256=Uqwv4_D3o53y0hZ32dYa1g9hJ3ED2vIZrZsErDQiexc,39275
gitdb/stream.py,sha256=j73ShvZ3IMwRAE9aAmRA9HucAmo_mZ_qyQkHQy2hN4E,27642
gitdb/test/__init__.py,sha256=m2nfHtiAdFArkDi-o6Ek7zVPmtKprUwlNIa19ZQR6Cs,210
gitdb/test/__pycache__/__init__.cpython-39.pyc,,
gitdb/test/__pycache__/lib.cpython-39.pyc,,
gitdb/test/__pycache__/test_base.cpython-39.pyc,,
gitdb/test/__pycache__/test_example.cpython-39.pyc,,
gitdb/test/__pycache__/test_pack.cpython-39.pyc,,
gitdb/test/__pycache__/test_stream.cpython-39.pyc,,
gitdb/test/__pycache__/test_util.cpython-39.pyc,,
gitdb/test/lib.py,sha256=ARgm4cWGfgmhjKQJ0lL8TnNh39vHUb5blkOcSoJ2fbo,6047
gitdb/test/test_base.py,sha256=548K9hQBmKPH1_16gJPpJHTYrJ2TX-HtCZ7FBiHhFys,2828
gitdb/test/test_example.py,sha256=dR2WMW7BGQd8F7rLyoSIxkyUq5i4K9z7y5HxjDpaf4E,1371
gitdb/test/test_pack.py,sha256=-h52pr7CCR6XJgyFYEiePCcHStLJVaGDOwcy0RPBsps,9243
gitdb/test/test_stream.py,sha256=JYEspdVf7ecoMD5GnMS_ZM-o_ejXP9v_3tYRnC2NwXM,5767
gitdb/test/test_util.py,sha256=-KDHp_HMhP5lHqBsn7hGa1apQrXfOMIA_HfCSDcMpSg,3261
gitdb/typ.py,sha256=VVpMfwE0Hrwfg67zx3CDYtJK3kME7qOGwaSq3-7Ia34,379
gitdb/util.py,sha256=FkJPuG2lNR1JeBZSgWaATJTPFtx8NBRBHDuD0lmshUs,12351
gitdb/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
gitdb/utils/__pycache__/__init__.cpython-39.pyc,,
gitdb/utils/__pycache__/encoding.cpython-39.pyc,,
gitdb/utils/encoding.py,sha256=ceZZFb86LGJ71cwW6qkq_BFquAlNE7jaafNbwxYRSXk,372

View File

@ -1,5 +0,0 @@
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any

View File

@ -1,40 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Initialize the object database module"""
import sys
import os
#{ Initialization
def _init_externals():
"""Initialize external projects by putting them into the path"""
for module in ('smmap',):
if 'PYOXIDIZER' not in os.environ:
sys.path.append(os.path.join(os.path.dirname(__file__), 'ext', module))
try:
__import__(module)
except ImportError:
raise ImportError("'%s' could not be imported, assure it is located in your PYTHONPATH" % module)
# END verify import
# END handle imports
#} END initialization
_init_externals()
__author__ = "Sebastian Thiel"
__contact__ = "byronimo@gmail.com"
__homepage__ = "https://github.com/gitpython-developers/gitdb"
version_info = (4, 0, 5)
__version__ = '.'.join(str(i) for i in version_info)
# default imports
from gitdb.base import *
from gitdb.db import *
from gitdb.stream import *

View File

@ -1,315 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module with basic data structures - they are designed to be lightweight and fast"""
from gitdb.util import bin_to_hex
from gitdb.fun import (
type_id_to_type_map,
type_to_type_id_map
)
__all__ = ('OInfo', 'OPackInfo', 'ODeltaPackInfo',
'OStream', 'OPackStream', 'ODeltaPackStream',
'IStream', 'InvalidOInfo', 'InvalidOStream')
#{ ODB Bases
class OInfo(tuple):
"""Carries information about an object in an ODB, providing information
about the binary sha of the object, the type_string as well as the uncompressed size
in bytes.
It can be accessed using tuple notation and using attribute access notation::
assert dbi[0] == dbi.binsha
assert dbi[1] == dbi.type
assert dbi[2] == dbi.size
The type is designed to be as lightweight as possible."""
__slots__ = tuple()
def __new__(cls, sha, type, size):
return tuple.__new__(cls, (sha, type, size))
def __init__(self, *args):
tuple.__init__(self)
#{ Interface
@property
def binsha(self):
""":return: our sha as binary, 20 bytes"""
return self[0]
@property
def hexsha(self):
""":return: our sha, hex encoded, 40 bytes"""
return bin_to_hex(self[0])
@property
def type(self):
return self[1]
@property
def type_id(self):
return type_to_type_id_map[self[1]]
@property
def size(self):
return self[2]
#} END interface
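# Illustrative sketch: OInfo is a plain tuple with named accessors, so index and attribute
# access observe the same slots (the sha and sizes below are made-up values):
def _oinfo_sketch():
    info = OInfo(b'\x00' * 20, b'blob', 42)
    assert info[0] == info.binsha and info[1] == info.type and info[2] == info.size
    assert info.hexsha == b'00' * 20  # bin_to_hex of the 20 zero bytes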
class OPackInfo(tuple):
"""As OInfo, but provides a type_id property to retrieve the numerical type id, and
does not include a sha.
Additionally, the pack_offset is the absolute offset into the packfile at which
all object information is located. The data_offset property points to the absolute
location in the pack at which that actual data stream can be found."""
__slots__ = tuple()
def __new__(cls, packoffset, type, size):
return tuple.__new__(cls, (packoffset, type, size))
def __init__(self, *args):
tuple.__init__(self)
#{ Interface
@property
def pack_offset(self):
return self[0]
@property
def type(self):
return type_id_to_type_map[self[1]]
@property
def type_id(self):
return self[1]
@property
def size(self):
return self[2]
#} END interface
class ODeltaPackInfo(OPackInfo):
"""Adds delta specific information,
Either the 20 byte sha which points to some object in the database,
or the negative offset from the pack_offset, so that pack_offset - delta_info yields
the pack offset of the base object"""
__slots__ = tuple()
def __new__(cls, packoffset, type, size, delta_info):
return tuple.__new__(cls, (packoffset, type, size, delta_info))
#{ Interface
@property
def delta_info(self):
return self[3]
#} END interface
class OStream(OInfo):
"""Base for object streams retrieved from the database, providing additional
information about the stream.
Generally, ODB streams are read-only as objects are immutable"""
__slots__ = tuple()
def __new__(cls, sha, type, size, stream, *args, **kwargs):
"""Helps with the initialization of subclasses"""
return tuple.__new__(cls, (sha, type, size, stream))
def __init__(self, *args, **kwargs):
tuple.__init__(self)
#{ Stream Reader Interface
def read(self, size=-1):
return self[3].read(size)
@property
def stream(self):
return self[3]
#} END stream reader interface
class ODeltaStream(OStream):
"""Uses size info of its stream, delaying reads"""
def __new__(cls, sha, type, size, stream, *args, **kwargs):
"""Helps with the initialization of subclasses"""
return tuple.__new__(cls, (sha, type, size, stream))
#{ Stream Reader Interface
@property
def size(self):
return self[3].size
#} END stream reader interface
class OPackStream(OPackInfo):
"""Next to pack object information, a stream outputting an undeltified base object
is provided"""
__slots__ = tuple()
def __new__(cls, packoffset, type, size, stream, *args):
"""Helps with the initialization of subclasses"""
return tuple.__new__(cls, (packoffset, type, size, stream))
#{ Stream Reader Interface
def read(self, size=-1):
return self[3].read(size)
@property
def stream(self):
return self[3]
#} END stream reader interface
class ODeltaPackStream(ODeltaPackInfo):
"""Provides a stream outputting the uncompressed offset delta information"""
__slots__ = tuple()
def __new__(cls, packoffset, type, size, delta_info, stream):
return tuple.__new__(cls, (packoffset, type, size, delta_info, stream))
#{ Stream Reader Interface
def read(self, size=-1):
return self[4].read(size)
@property
def stream(self):
return self[4]
#} END stream reader interface
class IStream(list):
"""Represents an input content stream to be fed into the ODB. It is mutable to allow
the ODB to record information about the operation's outcome right in this instance.
It provides interfaces for the OStream and a StreamReader to allow the instance
to blend in without prior conversion.
The only method your content stream must support is 'read'"""
__slots__ = tuple()
def __new__(cls, type, size, stream, sha=None):
return list.__new__(cls, (sha, type, size, stream, None))
def __init__(self, type, size, stream, sha=None):
list.__init__(self, (sha, type, size, stream, None))
#{ Interface
@property
def hexsha(self):
""":return: our sha, hex encoded, 40 bytes"""
return bin_to_hex(self[0])
def _error(self):
""":return: the error that occurred when processing the stream, or None"""
return self[4]
def _set_error(self, exc):
"""Set this input stream to the given exc, may be None to reset the error"""
self[4] = exc
error = property(_error, _set_error)
#} END interface
#{ Stream Reader Interface
def read(self, size=-1):
"""Implements a simple stream reader interface, passing the read call on
to our internal stream"""
return self[3].read(size)
#} END stream reader interface
#{ interface
def _set_binsha(self, binsha):
self[0] = binsha
def _binsha(self):
return self[0]
binsha = property(_binsha, _set_binsha)
def _type(self):
return self[1]
def _set_type(self, type):
self[1] = type
type = property(_type, _set_type)
def _size(self):
return self[2]
def _set_size(self, size):
self[2] = size
size = property(_size, _set_size)
def _stream(self):
return self[3]
def _set_stream(self, stream):
self[3] = stream
stream = property(_stream, _set_stream)
#} END odb info interface
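# Illustrative sketch: an IStream starts without a sha; a database's store() implementation
# fills in binsha on the very same instance while the caller keeps reading through it:
def _istream_sketch():
    from io import BytesIO
    ist = IStream(b'blob', 5, BytesIO(b'hello'))
    assert ist.binsha is None and ist.error is None
    assert ist.read() == b'hello'      # read() is forwarded to the wrapped stream
    ist.binsha = b'\x01' * 20          # what a successful store() would do
    assert ist.hexsha == b'01' * 20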
class InvalidOInfo(tuple):
"""Carries information about a sha identifying an object which is invalid in
the queried database. The exception attribute provides more information about
the cause of the issue"""
__slots__ = tuple()
def __new__(cls, sha, exc):
return tuple.__new__(cls, (sha, exc))
def __init__(self, sha, exc):
tuple.__init__(self, (sha, exc))
@property
def binsha(self):
return self[0]
@property
def hexsha(self):
return bin_to_hex(self[0])
@property
def error(self):
""":return: exception instance explaining the failure"""
return self[1]
class InvalidOStream(InvalidOInfo):
"""Carries information about an invalid ODB stream"""
__slots__ = tuple()
#} END ODB Bases

View File

@ -1,4 +0,0 @@
BYTE_SPACE = b' '
NULL_BYTE = b'\0'
NULL_HEX_SHA = "0" * 40
NULL_BIN_SHA = NULL_BYTE * 20

View File

@ -1,11 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
from gitdb.db.base import *
from gitdb.db.loose import *
from gitdb.db.mem import *
from gitdb.db.pack import *
from gitdb.db.git import *
from gitdb.db.ref import *

View File

@ -1,273 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Contains implementations of database retrieveing objects"""
from gitdb.util import (
join,
LazyMixin,
hex_to_bin
)
from gitdb.utils.encoding import force_text
from gitdb.exc import (
BadObject,
AmbiguousObjectName
)
from itertools import chain
from functools import reduce
__all__ = ('ObjectDBR', 'ObjectDBW', 'FileDBBase', 'CompoundDB', 'CachingDB')
class ObjectDBR(object):
"""Defines an interface for object database lookup.
Objects are identified by their 20 byte binary sha"""
def __contains__(self, sha):
return self.has_object(sha)
#{ Query Interface
def has_object(self, sha):
"""
:return: True if the object identified by the given 20 bytes
binary sha is contained in the database"""
raise NotImplementedError("To be implemented in subclass")
def info(self, sha):
""" :return: OInfo instance
:param sha: bytes binary sha
:raise BadObject:"""
raise NotImplementedError("To be implemented in subclass")
def stream(self, sha):
""":return: OStream instance
:param sha: 20 bytes binary sha
:raise BadObject:"""
raise NotImplementedError("To be implemented in subclass")
def size(self):
""":return: amount of objects in this database"""
raise NotImplementedError()
def sha_iter(self):
"""Return iterator yielding 20 byte shas for all objects in this data base"""
raise NotImplementedError()
#} END query interface
class ObjectDBW(object):
"""Defines an interface to create objects in the database"""
def __init__(self, *args, **kwargs):
self._ostream = None
#{ Edit Interface
def set_ostream(self, stream):
"""
Adjusts the stream to which all data should be sent when storing new objects
:param stream: if not None, the stream to use, if None the default stream
will be used.
:return: previously installed stream, or None if there was no override
:raise TypeError: if the stream doesn't have the supported functionality"""
cstream = self._ostream
self._ostream = stream
return cstream
def ostream(self):
"""
:return: overridden output stream this instance will write to, or None
if it will write to the default stream"""
return self._ostream
def store(self, istream):
"""
Create a new object in the database
:return: the input istream object with its sha set to its corresponding value
:param istream: IStream compatible instance. If its sha is already set
to a value, the object will just be stored in our database format,
in which case the input stream is expected to be in object format ( header + contents ).
:raise IOError: if data could not be written"""
raise NotImplementedError("To be implemented in subclass")
#} END edit interface
class FileDBBase(object):
"""Provides basic facilities to retrieve files of interest, including
caching facilities to help mapping hexsha's to objects"""
def __init__(self, root_path):
"""Initialize this instance to look for its files at the given root path
All subsequent operations will be relative to this path
:raise InvalidDBRoot:
**Note:** The base will not perform any accessibility checking as the base
might not yet be accessible, but become accessible before the first
access."""
super(FileDBBase, self).__init__()
self._root_path = root_path
#{ Interface
def root_path(self):
""":return: path at which this db operates"""
return self._root_path
def db_path(self, rela_path):
"""
:return: the given relative path relative to our database root, allowing
to potentially access datafiles"""
return join(self._root_path, force_text(rela_path))
#} END interface
class CachingDB(object):
"""A database which uses caches to speed-up access"""
#{ Interface
def update_cache(self, force=False):
"""
Call this method if the underlying data changed to trigger an update
of the internal caching structures.
:param force: if True, the update must be performed. Otherwise the implementation
may decide not to perform an update if it thinks nothing has changed.
:return: True if an update was performed as something had indeed changed"""
# END interface
def _databases_recursive(database, output):
"""Fill output list with database from db, in order. Deals with Loose, Packed
and compound databases."""
if isinstance(database, CompoundDB):
dbs = database.databases()
output.extend(db for db in dbs if not isinstance(db, CompoundDB))
for cdb in (db for db in dbs if isinstance(db, CompoundDB)):
_databases_recursive(cdb, output)
else:
output.append(database)
# END handle database type
class CompoundDB(ObjectDBR, LazyMixin, CachingDB):
"""A database which delegates calls to sub-databases.
Databases are stored in the lazy-loaded _dbs attribute.
Define _set_cache_ to update it with your databases"""
def _set_cache_(self, attr):
if attr == '_dbs':
self._dbs = list()
elif attr == '_db_cache':
self._db_cache = dict()
else:
super(CompoundDB, self)._set_cache_(attr)
def _db_query(self, sha):
""":return: database containing the given 20 byte sha
:raise BadObject:"""
# most databases use binary representations, prevent converting
# it every time a database is being queried
try:
return self._db_cache[sha]
except KeyError:
pass
# END first level cache
for db in self._dbs:
if db.has_object(sha):
self._db_cache[sha] = db
return db
# END for each database
raise BadObject(sha)
#{ ObjectDBR interface
def has_object(self, sha):
try:
self._db_query(sha)
return True
except BadObject:
return False
# END handle exceptions
def info(self, sha):
return self._db_query(sha).info(sha)
def stream(self, sha):
return self._db_query(sha).stream(sha)
def size(self):
""":return: total size of all contained databases"""
return reduce(lambda x, y: x + y, (db.size() for db in self._dbs), 0)
def sha_iter(self):
return chain(*(db.sha_iter() for db in self._dbs))
#} END object DBR Interface
#{ Interface
def databases(self):
""":return: tuple of database instances we use for lookups"""
return tuple(self._dbs)
def update_cache(self, force=False):
# something might have changed, clear everything
self._db_cache.clear()
stat = False
for db in self._dbs:
if isinstance(db, CachingDB):
stat |= db.update_cache(force)
# END if is caching db
# END for each database to update
return stat
def partial_to_complete_sha_hex(self, partial_hexsha):
"""
:return: 20 byte binary sha1 from the given less-than-40 byte hexsha (bytes or str)
:param partial_hexsha: hexsha with less than 40 byte
:raise AmbiguousObjectName: """
databases = list()
_databases_recursive(self, databases)
partial_hexsha = force_text(partial_hexsha)
len_partial_hexsha = len(partial_hexsha)
if len_partial_hexsha % 2 != 0:
partial_binsha = hex_to_bin(partial_hexsha + "0")
else:
partial_binsha = hex_to_bin(partial_hexsha)
# END assure successful binary conversion
candidate = None
for db in databases:
full_bin_sha = None
try:
if hasattr(db, 'partial_to_complete_sha_hex'):
full_bin_sha = db.partial_to_complete_sha_hex(partial_hexsha)
else:
full_bin_sha = db.partial_to_complete_sha(partial_binsha, len_partial_hexsha)
# END handle database type
except BadObject:
continue
# END ignore bad objects
if full_bin_sha:
if candidate and candidate != full_bin_sha:
raise AmbiguousObjectName(partial_hexsha)
candidate = full_bin_sha
# END handle candidate
# END for each db
if not candidate:
raise BadObject(partial_binsha)
return candidate
#} END interface

View File

@ -1,85 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
from gitdb.db.base import (
CompoundDB,
ObjectDBW,
FileDBBase
)
from gitdb.db.loose import LooseObjectDB
from gitdb.db.pack import PackedDB
from gitdb.db.ref import ReferenceDB
from gitdb.exc import InvalidDBRoot
import os
__all__ = ('GitDB', )
class GitDB(FileDBBase, ObjectDBW, CompoundDB):
"""A git-style object database, which contains all objects in the 'objects'
subdirectory
``IMPORTANT``: The usage of this implementation is highly discouraged as it fails to release file-handles.
This can be a problem with long-running processes and/or big repositories.
"""
# Configuration
PackDBCls = PackedDB
LooseDBCls = LooseObjectDB
ReferenceDBCls = ReferenceDB
# Directories
packs_dir = 'pack'
loose_dir = ''
alternates_dir = os.path.join('info', 'alternates')
def __init__(self, root_path):
"""Initialize ourselves on a git objects directory"""
super(GitDB, self).__init__(root_path)
def _set_cache_(self, attr):
if attr == '_dbs' or attr == '_loose_db':
self._dbs = list()
loose_db = None
for subpath, dbcls in ((self.packs_dir, self.PackDBCls),
(self.loose_dir, self.LooseDBCls),
(self.alternates_dir, self.ReferenceDBCls)):
path = self.db_path(subpath)
if os.path.exists(path):
self._dbs.append(dbcls(path))
if dbcls is self.LooseDBCls:
loose_db = self._dbs[-1]
# END remember loose db
# END check path exists
# END for each db type
# should have at least one subdb
if not self._dbs:
raise InvalidDBRoot(self.root_path())
# END handle error
# the first one should have the store method
assert loose_db is not None and hasattr(loose_db, 'store'), "First database needs store functionality"
# finally set the value
self._loose_db = loose_db
else:
super(GitDB, self)._set_cache_(attr)
# END handle attrs
#{ ObjectDBW interface
def store(self, istream):
return self._loose_db.store(istream)
def ostream(self):
return self._loose_db.ostream()
def set_ostream(self, ostream):
return self._loose_db.set_ostream(ostream)
#} END objectdbw interface
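# Illustrative usage sketch (the objects directory and hexsha below are hypothetical):
# GitDB composes packed, loose and alternate databases and answers the ObjectDBR interface
# for all of them.
def _gitdb_sketch(objects_dir='/path/to/repo/.git/objects', hexsha='0' * 40):
    from gitdb.util import hex_to_bin
    db = GitDB(objects_dir)
    binsha = hex_to_bin(hexsha)
    if db.has_object(binsha):
        info = db.info(binsha)              # OInfo: binsha, type and size
        data = db.stream(binsha).read()     # OStream adds a read()able data stream
    return db.size()                        # object count across all sub-databases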

View File

@ -1,258 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
from gitdb.db.base import (
FileDBBase,
ObjectDBR,
ObjectDBW
)
from gitdb.exc import (
BadObject,
AmbiguousObjectName
)
from gitdb.stream import (
DecompressMemMapReader,
FDCompressedSha1Writer,
FDStream,
Sha1Writer
)
from gitdb.base import (
OStream,
OInfo
)
from gitdb.util import (
file_contents_ro_filepath,
ENOENT,
hex_to_bin,
bin_to_hex,
exists,
chmod,
isdir,
isfile,
remove,
mkdir,
rename,
dirname,
basename,
join
)
from gitdb.fun import (
chunk_size,
loose_object_header_info,
write_object,
stream_copy
)
from gitdb.utils.encoding import force_bytes
import tempfile
import os
import sys
__all__ = ('LooseObjectDB', )
class LooseObjectDB(FileDBBase, ObjectDBR, ObjectDBW):
"""A database which operates on loose object files"""
# CONFIGURATION
# chunks in which data will be copied between streams
stream_chunk_size = chunk_size
# On windows we need to keep it writable, otherwise it cannot be removed
# either
new_objects_mode = int("444", 8)
if os.name == 'nt':
new_objects_mode = int("644", 8)
def __init__(self, root_path):
super(LooseObjectDB, self).__init__(root_path)
self._hexsha_to_file = dict()
# Additional Flags - might be set to 0 after the first failure
# Depending on the root, this might work for some mounts, for others not, which
# is why it is per instance
self._fd_open_flags = getattr(os, 'O_NOATIME', 0)
#{ Interface
def object_path(self, hexsha):
"""
:return: path at which the object with the given hexsha would be stored,
relative to the database root"""
return join(hexsha[:2], hexsha[2:])
def readable_db_object_path(self, hexsha):
"""
:return: readable object path to the object identified by hexsha
:raise BadObject: If the object file does not exist"""
try:
return self._hexsha_to_file[hexsha]
except KeyError:
pass
# END ignore cache misses
# try filesystem
path = self.db_path(self.object_path(hexsha))
if exists(path):
self._hexsha_to_file[hexsha] = path
return path
# END handle cache
raise BadObject(hexsha)
def partial_to_complete_sha_hex(self, partial_hexsha):
""":return: 20 byte binary sha1 string which matches the given name uniquely
:param partial_hexsha: hexadecimal partial name (bytes or ascii string)
:raise AmbiguousObjectName:
:raise BadObject: """
candidate = None
for binsha in self.sha_iter():
if bin_to_hex(binsha).startswith(force_bytes(partial_hexsha)):
# it can't ever find the same object twice
if candidate is not None:
raise AmbiguousObjectName(partial_hexsha)
candidate = binsha
# END for each object
if candidate is None:
raise BadObject(partial_hexsha)
return candidate
#} END interface
def _map_loose_object(self, sha):
"""
:return: memory map of that file to allow random read access
:raise BadObject: if object could not be located"""
db_path = self.db_path(self.object_path(bin_to_hex(sha)))
try:
return file_contents_ro_filepath(db_path, flags=self._fd_open_flags)
except OSError as e:
if e.errno != ENOENT:
# try again without noatime
try:
return file_contents_ro_filepath(db_path)
except OSError:
raise BadObject(sha)
# didn't work because of our flag, don't try it again
self._fd_open_flags = 0
else:
raise BadObject(sha)
# END handle error
# END exception handling
def set_ostream(self, stream):
""":raise TypeError: if the stream does not support the Sha1Writer interface"""
if stream is not None and not isinstance(stream, Sha1Writer):
raise TypeError("Output stream musst support the %s interface" % Sha1Writer.__name__)
return super(LooseObjectDB, self).set_ostream(stream)
def info(self, sha):
m = self._map_loose_object(sha)
try:
typ, size = loose_object_header_info(m)
return OInfo(sha, typ, size)
finally:
if hasattr(m, 'close'):
m.close()
# END assure release of system resources
def stream(self, sha):
m = self._map_loose_object(sha)
type, size, stream = DecompressMemMapReader.new(m, close_on_deletion=True)
return OStream(sha, type, size, stream)
def has_object(self, sha):
try:
self.readable_db_object_path(bin_to_hex(sha))
return True
except BadObject:
return False
# END check existence
def store(self, istream):
"""note: The sha we produce will be hex by nature"""
tmp_path = None
writer = self.ostream()
if writer is None:
# open a tmp file to write the data to
fd, tmp_path = tempfile.mkstemp(prefix='obj', dir=self._root_path)
if istream.binsha is None:
writer = FDCompressedSha1Writer(fd)
else:
writer = FDStream(fd)
# END handle direct stream copies
# END handle custom writer
try:
try:
if istream.binsha is not None:
# copy as much as possible, the actual uncompressed item size might
# be smaller than the compressed version
stream_copy(istream.read, writer.write, sys.maxsize, self.stream_chunk_size)
else:
# write object with header, we have to make a new one
write_object(istream.type, istream.size, istream.read, writer.write,
chunk_size=self.stream_chunk_size)
# END handle direct stream copies
finally:
if tmp_path:
writer.close()
# END assure target stream is closed
except:
if tmp_path:
os.remove(tmp_path)
raise
# END assure tmpfile removal on error
hexsha = None
if istream.binsha:
hexsha = istream.hexsha
else:
hexsha = writer.sha(as_hex=True)
# END handle sha
if tmp_path:
obj_path = self.db_path(self.object_path(hexsha))
obj_dir = dirname(obj_path)
if not isdir(obj_dir):
mkdir(obj_dir)
# END handle destination directory
# rename onto existing doesn't work on NTFS
if isfile(obj_path):
remove(tmp_path)
else:
rename(tmp_path, obj_path)
# end rename only if needed
# make sure it's readable for all! It started out as an owner-only tmp file
# but needs to be readable by everyone
chmod(obj_path, self.new_objects_mode)
# END handle dry_run
istream.binsha = hex_to_bin(hexsha)
return istream
def sha_iter(self):
# find all files which look like an object, extract sha from there
for root, dirs, files in os.walk(self.root_path()):
root_base = basename(root)
if len(root_base) != 2:
continue
for f in files:
if len(f) != 38:
continue
yield hex_to_bin(root_base + f)
# END for each file
# END for each walk iteration
def size(self):
return len(tuple(self.sha_iter()))
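# Illustrative usage sketch (the root directory is hypothetical and must already exist):
# store() compresses the stream, writes it under <root>/<sha[:2]>/<sha[2:]> and fills in
# the IStream's binsha on the way.
def _loose_store_sketch(root='/tmp/loose-objects-demo'):
    from io import BytesIO
    from gitdb.base import IStream
    ldb = LooseObjectDB(root)
    istream = ldb.store(IStream(b'blob', 5, BytesIO(b'hello')))
    assert ldb.has_object(istream.binsha)
    assert ldb.info(istream.binsha).size == 5
    return istream.hexsha                   # 40 hex characters computed while writing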

View File

@ -1,112 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Contains the MemoryDatabase implementation"""
from gitdb.db.loose import LooseObjectDB
from gitdb.db.base import (
ObjectDBR,
ObjectDBW
)
from gitdb.base import (
OStream,
IStream,
)
from gitdb.exc import (
BadObject,
UnsupportedOperation
)
from gitdb.stream import (
ZippedStoreShaWriter,
DecompressMemMapReader,
)
from io import BytesIO
__all__ = ("MemoryDB", )
class MemoryDB(ObjectDBR, ObjectDBW):
"""A memory database stores everything to memory, providing fast IO and object
retrieval. It should be used to buffer results and obtain SHAs before writing
them to the actual physical storage, as it allows querying whether an object
already exists in the target storage before introducing actual IO"""
def __init__(self):
super(MemoryDB, self).__init__()
self._db = LooseObjectDB("path/doesnt/matter")
# maps 20 byte shas to their OStream objects
self._cache = dict()
def set_ostream(self, stream):
raise UnsupportedOperation("MemoryDB's always stream into memory")
def store(self, istream):
zstream = ZippedStoreShaWriter()
self._db.set_ostream(zstream)
istream = self._db.store(istream)
zstream.close() # close to flush
zstream.seek(0)
# don't provide a size, the stream is written in object format, hence the
# header needs decompression
decomp_stream = DecompressMemMapReader(zstream.getvalue(), close_on_deletion=False)
self._cache[istream.binsha] = OStream(istream.binsha, istream.type, istream.size, decomp_stream)
return istream
def has_object(self, sha):
return sha in self._cache
def info(self, sha):
# we always return streams, which are infos as well
return self.stream(sha)
def stream(self, sha):
try:
ostream = self._cache[sha]
# rewind stream for the next one to read
ostream.stream.seek(0)
return ostream
except KeyError:
raise BadObject(sha)
# END exception handling
def size(self):
return len(self._cache)
def sha_iter(self):
try:
return self._cache.iterkeys()
except AttributeError:
return self._cache.keys()
#{ Interface
def stream_copy(self, sha_iter, odb):
"""Copy the streams as identified by sha's yielded by sha_iter into the given odb
The streams will be copied directly
**Note:** the object will only be written if it did not exist in the target db
:return: amount of streams actually copied into odb. If smaller than the amount
of input shas, one or more objects already existed in odb"""
count = 0
for sha in sha_iter:
if odb.has_object(sha):
continue
# END check object existence
ostream = self.stream(sha)
# compressed data including header
sio = BytesIO(ostream.stream.data())
istream = IStream(ostream.type, ostream.size, sio, sha)
odb.store(istream)
count += 1
# END for each sha
return count
#} END interface
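# Illustrative usage sketch: MemoryDB as a staging area. 'destination_db' is assumed to be
# any object database offering has_object() and store(), e.g. a LooseObjectDB.
def _memorydb_sketch(destination_db):
    istream = IStream(b'blob', 3, BytesIO(b'abc'))
    mdb = MemoryDB()
    mdb.store(istream)
    assert mdb.has_object(istream.binsha)
    return mdb.stream_copy([istream.binsha], destination_db)  # 1 if the object was new there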

View File

@ -1,206 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module containing a database to deal with packs"""
from gitdb.db.base import (
FileDBBase,
ObjectDBR,
CachingDB
)
from gitdb.util import LazyMixin
from gitdb.exc import (
BadObject,
UnsupportedOperation,
AmbiguousObjectName
)
from gitdb.pack import PackEntity
from functools import reduce
import os
import glob
__all__ = ('PackedDB', )
#{ Utilities
class PackedDB(FileDBBase, ObjectDBR, CachingDB, LazyMixin):
"""A database operating on a set of object packs"""
# sort the priority list every N queries
# Higher values are better, performance tests don't show this has
# any effect, but it should have one
_sort_interval = 500
def __init__(self, root_path):
super(PackedDB, self).__init__(root_path)
# list of lists with three items:
# * hits - number of times the pack was hit with a request
# * entity - Pack entity instance
# * sha_to_index - PackIndexFile.sha_to_index method for direct cache query
# self._entities = list() # lazy loaded list
self._hit_count = 0 # amount of hits
self._st_mtime = 0 # last modification data of our root path
def _set_cache_(self, attr):
if attr == '_entities':
self._entities = list()
self.update_cache(force=True)
# END handle entities initialization
def _sort_entities(self):
self._entities.sort(key=lambda l: l[0], reverse=True)
def _pack_info(self, sha):
""":return: tuple(entity, index) for an item at the given sha
:param sha: 20 or 40 byte sha
:raise BadObject:
**Note:** This method is not thread-safe, but may be hit in multi-threaded
operation. The worst thing that can happen though is a counter that
was not incremented, or the list being in the wrong order. So we save
the time for locking here, let's see how that goes"""
# presort ?
if self._hit_count % self._sort_interval == 0:
self._sort_entities()
# END update sorting
for item in self._entities:
index = item[2](sha)
if index is not None:
item[0] += 1 # one hit for you
self._hit_count += 1 # general hit count
return (item[1], index)
# END index found in pack
# END for each item
# no hit, see whether we have to update packs
# NOTE: considering packs don't change very often, we save this call
# and leave it to the super-caller to trigger that
raise BadObject(sha)
#{ Object DB Read
def has_object(self, sha):
try:
self._pack_info(sha)
return True
except BadObject:
return False
# END exception handling
def info(self, sha):
entity, index = self._pack_info(sha)
return entity.info_at_index(index)
def stream(self, sha):
entity, index = self._pack_info(sha)
return entity.stream_at_index(index)
def sha_iter(self):
for entity in self.entities():
index = entity.index()
sha_by_index = index.sha
for index in range(index.size()):
yield sha_by_index(index)
# END for each index
# END for each entity
def size(self):
sizes = [item[1].index().size() for item in self._entities]
return reduce(lambda x, y: x + y, sizes, 0)
#} END object db read
#{ object db write
def store(self, istream):
"""Storing individual objects is not feasible as a pack is designed to
hold multiple objects. Writing or rewriting packs for single objects is
inefficient"""
raise UnsupportedOperation()
#} END object db write
#{ Interface
def update_cache(self, force=False):
"""
Update our cache with the actually existing packs on disk. Add new ones,
and remove deleted ones. We keep the unchanged ones
:param force: If True, the cache will be updated even though the directory
does not appear to have changed according to its modification timestamp.
:return: True if the packs have been updated so there is new information,
False if there was no change to the pack database"""
stat = os.stat(self.root_path())
if not force and stat.st_mtime <= self._st_mtime:
return False
# END abort early on no change
self._st_mtime = stat.st_mtime
# packs are supposed to be prefixed with pack- by git-convention
# get all pack files, figure out what changed
pack_files = set(glob.glob(os.path.join(self.root_path(), "pack-*.pack")))
our_pack_files = {item[1].pack().path() for item in self._entities}
# new packs
for pack_file in (pack_files - our_pack_files):
# init the hit-counter/priority with the size, a good measure for hit-
# probability. It's implemented so that only 12 bytes will be read
entity = PackEntity(pack_file)
self._entities.append([entity.pack().size(), entity, entity.index().sha_to_index])
# END for each new packfile
# removed packs
for pack_file in (our_pack_files - pack_files):
del_index = -1
for i, item in enumerate(self._entities):
if item[1].pack().path() == pack_file:
del_index = i
break
# END found index
# END for each entity
assert del_index != -1
del(self._entities[del_index])
# END for each removed pack
# reinitialize priorities
self._sort_entities()
return True
def entities(self):
""":return: list of pack entities operated upon by this database"""
return [item[1] for item in self._entities]
def partial_to_complete_sha(self, partial_binsha, canonical_length):
""":return: 20 byte sha as inferred by the given partial binary sha
:param partial_binsha: binary sha with less than 20 bytes
:param canonical_length: length of the corresponding canonical representation.
It is required as binary sha's cannot display whether the original hex sha
had an odd or even number of characters
:raise AmbiguousObjectName:
:raise BadObject: """
candidate = None
for item in self._entities:
item_index = item[1].index().partial_sha_to_index(partial_binsha, canonical_length)
if item_index is not None:
sha = item[1].index().sha(item_index)
if candidate and candidate != sha:
raise AmbiguousObjectName(partial_binsha)
candidate = sha
# END handle full sha could be found
# END for each entity
if candidate:
return candidate
# still not found ?
raise BadObject(partial_binsha)
#} END interface
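# Illustrative usage sketch (the pack directory is hypothetical): the entity list is built
# lazily, update_cache() rescans the directory, and lookups go through the pack indices.
def _packeddb_sketch(pack_dir='/path/to/repo/.git/objects/pack'):
    pdb = PackedDB(pack_dir)
    pdb.update_cache(force=True)            # pick up pack-*.pack files on disk
    for binsha in pdb.sha_iter():
        first_info = pdb.info(binsha)       # info of the first object found, if any
        break
    return pdb.size()                       # number of objects over all pack indices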

View File

@ -1,82 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
import codecs
from gitdb.db.base import (
CompoundDB,
)
__all__ = ('ReferenceDB', )
class ReferenceDB(CompoundDB):
"""A database consisting of database referred to in a file"""
# Configuration
# Specifies the object database to use for the paths found in the alternates
# file. If None, it defaults to the GitDB
ObjectDBCls = None
def __init__(self, ref_file):
super(ReferenceDB, self).__init__()
self._ref_file = ref_file
def _set_cache_(self, attr):
if attr == '_dbs':
self._dbs = list()
self._update_dbs_from_ref_file()
else:
super(ReferenceDB, self)._set_cache_(attr)
# END handle attrs
def _update_dbs_from_ref_file(self):
dbcls = self.ObjectDBCls
if dbcls is None:
# late import
from gitdb.db.git import GitDB
dbcls = GitDB
# END get db type
# try to get as many as possible, don't fail if some are unavailable
ref_paths = list()
try:
with codecs.open(self._ref_file, 'r', encoding="utf-8") as f:
ref_paths = [l.strip() for l in f]
except (OSError, IOError):
pass
# END handle alternates
ref_paths_set = set(ref_paths)
cur_ref_paths_set = {db.root_path() for db in self._dbs}
# remove existing
for path in (cur_ref_paths_set - ref_paths_set):
for i, db in enumerate(self._dbs[:]):
if db.root_path() == path:
del(self._dbs[i])
continue
# END del matching db
# END for each path to remove
# add new
# sort them to maintain order
added_paths = sorted(ref_paths_set - cur_ref_paths_set, key=lambda p: ref_paths.index(p))
for path in added_paths:
try:
db = dbcls(path)
# force an update to verify path
if isinstance(db, CompoundDB):
db.databases()
# END verification
self._dbs.append(db)
except Exception:
# ignore invalid paths or issues
pass
# END for each path to add
def update_cache(self, force=False):
# re-read alternates and update databases
self._update_dbs_from_ref_file()
return super(ReferenceDB, self).update_cache(force)
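# Illustrative usage sketch (the alternates file path is hypothetical): the file lists one
# objects directory per line, and each becomes a sub-database of this ReferenceDB.
def _referencedb_sketch(alternates_file='/path/to/repo/.git/objects/info/alternates'):
    rdb = ReferenceDB(alternates_file)
    rdb.update_cache()                      # re-read the file and rebuild the sub-databases
    return [db.root_path() for db in rdb.databases()]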

View File

@ -1,46 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module with common exceptions"""
from gitdb.util import to_hex_sha
class ODBError(Exception):
"""All errors thrown by the object database"""
class InvalidDBRoot(ODBError):
"""Thrown if an object database cannot be initialized at the given path"""
class BadObject(ODBError):
"""The object with the given SHA does not exist. Instantiate with the
failed sha"""
def __str__(self):
return "BadObject: %s" % to_hex_sha(self.args[0])
class BadName(ODBError):
"""A name provided to rev_parse wasn't understood"""
def __str__(self):
return "Ref '%s' did not resolve to an object" % self.args[0]
class ParseError(ODBError):
"""Thrown if the parsing of a file failed due to an invalid format"""
class AmbiguousObjectName(ODBError):
"""Thrown if a possibly shortened name does not uniquely represent a single object
in the database"""
class BadObjectType(ODBError):
"""The object had an unsupported type"""
class UnsupportedOperation(ODBError):
"""Thrown if the given operation cannot be supported by the object database"""

View File

@ -1,704 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Contains basic c-functions which usually contain performance critical code
Keeping this code separate from the beginning makes it easier to port
it to C later, if required"""
import zlib
from gitdb.util import byte_ord
decompressobj = zlib.decompressobj
import mmap
from itertools import islice
from functools import reduce
from gitdb.const import NULL_BYTE, BYTE_SPACE
from gitdb.utils.encoding import force_text
from gitdb.typ import (
str_blob_type,
str_commit_type,
str_tree_type,
str_tag_type,
)
from io import BytesIO
# INVARIANTS
OFS_DELTA = 6
REF_DELTA = 7
delta_types = (OFS_DELTA, REF_DELTA)
type_id_to_type_map = {
0: b'', # EXT 1
1: str_commit_type,
2: str_tree_type,
3: str_blob_type,
4: str_tag_type,
5: b'', # EXT 2
OFS_DELTA: "OFS_DELTA", # OFFSET DELTA
REF_DELTA: "REF_DELTA" # REFERENCE DELTA
}
type_to_type_id_map = {
str_commit_type: 1,
str_tree_type: 2,
str_blob_type: 3,
str_tag_type: 4,
"OFS_DELTA": OFS_DELTA,
"REF_DELTA": REF_DELTA,
}
# used when dealing with larger streams
chunk_size = 1000 * mmap.PAGESIZE
__all__ = ('is_loose_object', 'loose_object_header_info', 'msb_size', 'pack_object_header_info',
'write_object', 'loose_object_header', 'stream_copy', 'apply_delta_data',
'is_equal_canonical_sha', 'connect_deltas', 'DeltaChunkList', 'create_pack_object_header')
#{ Structures
def _set_delta_rbound(d, size):
"""Truncate the given delta to the given size
:param size: size relative to our target offset, may not be 0, must be smaller or equal
to our size
:return: d"""
d.ts = size
# NOTE: data is truncated automatically when applying the delta
# MUST NOT DO THIS HERE
return d
def _move_delta_lbound(d, bytes):
"""Move the delta by the given amount of bytes, reducing its size so that its
right bound stays static
:param bytes: amount of bytes to move, must be smaller than delta size
:return: d"""
if bytes == 0:
return
d.to += bytes
d.so += bytes
d.ts -= bytes
if d.data is not None:
d.data = d.data[bytes:]
# END handle data
return d
def delta_duplicate(src):
return DeltaChunk(src.to, src.ts, src.so, src.data)
def delta_chunk_apply(dc, bbuf, write):
"""Apply own data to the target buffer
:param bbuf: buffer providing source bytes for copy operations
:param write: write method to call with data to write"""
if dc.data is None:
# COPY DATA FROM SOURCE
write(bbuf[dc.so:dc.so + dc.ts])
else:
# APPEND DATA
# what's faster: if + 4 function calls or just a write with a slice ?
# Considering data can be larger than 127 bytes now, it should be worth it
if dc.ts < len(dc.data):
write(dc.data[:dc.ts])
else:
write(dc.data)
# END handle truncation
# END handle chunk mode
class DeltaChunk(object):
"""Represents a piece of a delta, it can either add new data, or copy existing
one from a source buffer"""
__slots__ = (
'to', # start offset in the target buffer in bytes
'ts', # size of this chunk in the target buffer in bytes
'so', # start offset in the source buffer in bytes or None
'data', # chunk of bytes to be added to the target buffer,
# DeltaChunkList to use as base, or None
)
def __init__(self, to, ts, so, data):
self.to = to
self.ts = ts
self.so = so
self.data = data
def __repr__(self):
return "DeltaChunk(%i, %i, %s, %s)" % (self.to, self.ts, self.so, self.data or "")
#{ Interface
def rbound(self):
return self.to + self.ts
def has_data(self):
""":return: True if the instance has data to add to the target stream"""
return self.data is not None
#} END interface
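# Illustrative sketch: a chunk with data=None copies ts bytes from the base buffer at
# offset so, while a chunk carrying data appends its own bytes truncated to ts.
def _delta_chunk_sketch():
    out = []
    delta_chunk_apply(DeltaChunk(0, 5, 6, None), b'hello world', out.append)   # copies b'world'
    delta_chunk_apply(DeltaChunk(5, 1, 0, b'!?'), b'hello world', out.append)  # appends b'!' only
    assert b''.join(out) == b'world!'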
def _closest_index(dcl, absofs):
""":return: index at which the given absofs should be inserted. The index points
to the DeltaChunk with a target buffer absofs that equals or is greater than
absofs.
**Note:** global method for performance only, it belongs to DeltaChunkList"""
lo = 0
hi = len(dcl)
while lo < hi:
mid = (lo + hi) // 2  # integer division - a float index would raise a TypeError
dc = dcl[mid]
if dc.to > absofs:
hi = mid
elif dc.rbound() > absofs or dc.to == absofs:
return mid
else:
lo = mid + 1
# END handle bound
# END for each delta absofs
return len(dcl) - 1
def delta_list_apply(dcl, bbuf, write):
"""Apply the chain's changes and write the final result using the passed
write function.
:param bbuf: base buffer containing the base of all deltas contained in this
list. It will only be used if the chunk in question does not have a base
chain.
:param write: function taking a string of bytes to write to the output"""
for dc in dcl:
delta_chunk_apply(dc, bbuf, write)
# END for each dc
def delta_list_slice(dcl, absofs, size, ndcl):
""":return: Subsection of this list at the given absolute offset, with the given
size in bytes.
:return: None"""
cdi = _closest_index(dcl, absofs) # delta start index
cd = dcl[cdi]
slen = len(dcl)
lappend = ndcl.append
if cd.to != absofs:
tcd = DeltaChunk(cd.to, cd.ts, cd.so, cd.data)
_move_delta_lbound(tcd, absofs - cd.to)
tcd.ts = min(tcd.ts, size)
lappend(tcd)
size -= tcd.ts
cdi += 1
# END lbound overlap handling
while cdi < slen and size:
# are we larger than the current block
cd = dcl[cdi]
if cd.ts <= size:
lappend(DeltaChunk(cd.to, cd.ts, cd.so, cd.data))
size -= cd.ts
else:
tcd = DeltaChunk(cd.to, cd.ts, cd.so, cd.data)
tcd.ts = size
lappend(tcd)
size -= tcd.ts
break
# END handle size
cdi += 1
# END for each chunk
class DeltaChunkList(list):
"""List with special functionality to deal with DeltaChunks.
There are two types of lists we represent. The one was created bottom-up, working
towards the latest delta, the other kind was created top-down, working from the
latest delta down to the earliest ancestor. This attribute is queryable
after all processing with is_reversed."""
__slots__ = tuple()
def rbound(self):
""":return: rightmost extend in bytes, absolute"""
if len(self) == 0:
return 0
return self[-1].rbound()
def lbound(self):
""":return: leftmost byte at which this chunklist starts"""
if len(self) == 0:
return 0
return self[0].to
def size(self):
""":return: size of bytes as measured by our delta chunks"""
return self.rbound() - self.lbound()
def apply(self, bbuf, write):
"""Only used by public clients, internally we only use the global routines
for performance"""
return delta_list_apply(self, bbuf, write)
def compress(self):
"""Alter the list to reduce the amount of nodes. Currently we concatenate
add-chunks
:return: self"""
slen = len(self)
if slen < 2:
return self
i = 0
first_data_index = None
while i < slen:
dc = self[i]
i += 1
if dc.data is None:
if first_data_index is not None and i - 2 - first_data_index > 1:
# if first_data_index is not None:
nd = BytesIO() # new data - chunk data is bytes, so a bytes buffer is required
so = self[first_data_index].to # start offset in target buffer
for x in range(first_data_index, i - 1):
xdc = self[x]
nd.write(xdc.data[:xdc.ts])
# END collect data
del(self[first_data_index:i - 1])
buf = nd.getvalue()
self.insert(first_data_index, DeltaChunk(so, len(buf), 0, buf))
slen = len(self)
i = first_data_index + 1
# END concatenate data
first_data_index = None
continue
# END skip non-data chunks
if first_data_index is None:
first_data_index = i - 1
# END iterate list
# if slen_orig != len(self):
# print "INFO: Reduced delta list len to %f %% of former size" % ((float(len(self)) / slen_orig) * 100)
return self
def check_integrity(self, target_size=-1):
"""Verify the list has non-overlapping chunks only, and the total size matches
target_size
:param target_size: if not -1, the total size of the chain must be target_size
:raise AssertionError: if the size doesn't match"""
if target_size > -1:
assert self[-1].rbound() == target_size
assert reduce(lambda x, y: x + y, (d.ts for d in self), 0) == target_size
# END target size verification
if len(self) < 2:
return
# check data
for dc in self:
assert dc.ts > 0
if dc.has_data():
assert len(dc.data) >= dc.ts
# END for each dc
left = islice(self, 0, len(self) - 1)
right = iter(self)
next(right)
# this is very pythonic - we might have just used index based access here,
# but this could actually be faster
for lft, rgt in zip(left, right):
assert lft.rbound() == rgt.to
assert lft.to + lft.ts == rgt.to
# END for each pair
class TopdownDeltaChunkList(DeltaChunkList):
"""Represents a list which is generated by feeding its ancestor streams one by
one"""
__slots__ = tuple()
def connect_with_next_base(self, bdcl):
"""Connect this chain with the next level of our base delta chunklist.
The goal in this game is to mark as many of our chunks rigid, hence they
cannot be changed by any of the upcoming bases anymore. Once all our
chunks are marked like that, we can stop all processing
:param bdcl: data chunk list being one of our bases. They must be fed in
consecutively and in order, towards the earliest ancestor delta
:return: True if processing was done. Use it to abort processing of
remaining streams if False is returned"""
nfc = 0 # number of frozen chunks
dci = 0 # delta chunk index
slen = len(self) # len of self
ccl = list() # temporary list
while dci < slen:
dc = self[dci]
dci += 1
# all add-chunks which are already topmost don't need additional processing
if dc.data is not None:
nfc += 1
continue
# END skip add chunks
# copy chunks
# integrate the portion of the base list into ourselves. Lists
# don't support efficient insertion ( just one at a time ), but for now
# we live with it. Internally, its all just a 32/64bit pointer, and
# the portions of moved memory should be smallish. Maybe we just rebuild
# ourselves in order to reduce the amount of insertions ...
del(ccl[:])
delta_list_slice(bdcl, dc.so, dc.ts, ccl)
# move the target bounds into place to match with our chunk
ofs = dc.to - dc.so
for cdc in ccl:
cdc.to += ofs
# END update target bounds
if len(ccl) == 1:
self[dci - 1] = ccl[0]
else:
# maybe try to compute the expenses here, and pick the right algorithm
# It would normally be faster than copying everything physically though
# TODO: Use a deque here, and decide by the index whether to extend
# or extend left !
post_dci = self[dci:]
del(self[dci - 1:]) # include deletion of dc
self.extend(ccl)
self.extend(post_dci)
slen = len(self)
dci += len(ccl) - 1 # deleted dc, added rest
# END handle chunk replacement
# END for each chunk
if nfc == slen:
return False
# END handle completeness
return True
#} END structures
#{ Routines
def is_loose_object(m):
"""
:return: True if the file contained in memory map m appears to be a loose object.
Only the first two bytes are needed"""
b0, b1 = byte_ord(m[0]), byte_ord(m[1])  # byte_ord copes with byte strings and integer bytes alike
word = (b0 << 8) + b1
return b0 == 0x78 and (word % 31) == 0
def loose_object_header_info(m):
"""
:return: tuple(type_string, uncompressed_size_in_bytes) the type string of the
object as well as its uncompressed size in bytes.
:param m: memory map from which to read the compressed object data"""
decompress_size = 8192 # is used in cgit as well
hdr = decompressobj().decompress(m, decompress_size)
type_name, size = hdr[:hdr.find(NULL_BYTE)].split(BYTE_SPACE)
return type_name, int(size)
def pack_object_header_info(data):
"""
:return: tuple(type_id, uncompressed_size_in_bytes, byte_offset)
The type_id should be interpreted according to the ``type_id_to_type_map`` map
The byte-offset specifies the start of the actual zlib compressed datastream
:param data: random-access memory, like a string or memory map
c = byte_ord(data[0]) # first byte
i = 1 # next char to read
type_id = (c >> 4) & 7 # numeric type
size = c & 15 # starting size
s = 4 # starting bit-shift size
while c & 0x80:
c = byte_ord(data[i])
i += 1
size += (c & 0x7f) << s
s += 7
# END character loop
# end performance at expense of maintenance ...
return (type_id, size, i)
def create_pack_object_header(obj_type, obj_size):
"""
:return: string defining the pack header comprised of the object type
and its uncompressed size in bytes
:param obj_type: pack type_id of the object
:param obj_size: uncompressed size in bytes of the following object stream"""
c = 0 # 1 byte
hdr = bytearray() # output string
c = (obj_type << 4) | (obj_size & 0xf)
obj_size >>= 4
while obj_size:
hdr.append(c | 0x80)
c = obj_size & 0x7f
obj_size >>= 7
# END until size is consumed
hdr.append(c)
# end handle interpreter
return hdr
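# Illustrative sketch: the two pack-header routines are inverses of each other - the first
# byte carries the type id and the low size bits, further size bits follow in 7-bit groups.
def _pack_header_sketch():
    hdr = create_pack_object_header(1, 100)             # type id 1 (commit), 100 bytes
    type_id, size, offset = pack_object_header_info(bytes(hdr))
    assert (type_id, size, offset) == (1, 100, len(hdr))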
def msb_size(data, offset=0):
"""
:return: tuple(read_bytes, size) read the msb size from the given random
access data starting at the given byte offset"""
size = 0
i = 0
l = len(data)
hit_msb = False
while i < l:
c = data[i + offset]
size |= (c & 0x7f) << i * 7
i += 1
if not c & 0x80:
hit_msb = True
break
# END check msb bit
# END while in range
# end performance ...
if not hit_msb:
raise AssertionError("Could not find terminating MSB byte in data stream")
return i + offset, size
def loose_object_header(type, size):
"""
:return: bytes representing the loose object header, which is immediately
followed by the content stream of size 'size'"""
return ('%s %i\0' % (force_text(type), size)).encode('ascii')
def write_object(type, size, read, write, chunk_size=chunk_size):
"""
Write the object as identified by type, size and source_stream into the
target_stream
:param type: type string of the object
:param size: amount of bytes to write from source_stream
:param read: read method of a stream providing the content data
:param write: write method of the output stream
:return: The actual amount of bytes written to the stream, which includes the header"""
tbw = 0 # total num bytes written
# WRITE HEADER: type SP size NULL
tbw += write(loose_object_header(type, size))
tbw += stream_copy(read, write, size, chunk_size)
return tbw
def stream_copy(read, write, size, chunk_size):
"""
Copy a stream up to size bytes using the provided read and write methods,
in chunks of chunk_size
**Note:** it's much like the stream_copy utility, but operates just using methods"""
dbw = 0 # num data bytes written
# WRITE ALL DATA UP TO SIZE
while True:
cs = min(chunk_size, size - dbw)
# NOTE: not all write methods return the amount of written bytes, like
# mmap.write. It's bad, but we just deal with it ... perhaps it's not
# even less efficient
# data_len = write(read(cs))
# dbw += data_len
data = read(cs)
data_len = len(data)
dbw += data_len
write(data)
if data_len < cs or dbw == size:
break
# END check for stream end
# END duplicate data
return dbw
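# Illustrative sketch: write_object emits the canonical loose-object layout, i.e. the
# 'type SP size NUL' header followed by the raw content copied via stream_copy.
def _write_object_sketch():
    from io import BytesIO
    src, out = BytesIO(b'hello'), BytesIO()
    written = write_object(b'blob', 5, src.read, out.write)
    assert out.getvalue() == b'blob 5\x00hello'
    assert written == len(out.getvalue())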
def connect_deltas(dstreams):
"""
Read the condensed delta chunk information from dstream and merge its information
into a list of existing delta chunks
:param dstreams: iterable of delta stream objects, the delta to be applied last
comes first, then all its ancestors in order
:return: DeltaChunkList, containing all operations to apply"""
tdcl = None # topmost dcl
dcl = tdcl = TopdownDeltaChunkList()
for dsi, ds in enumerate(dstreams):
# print "Stream", dsi
db = ds.read()
delta_buf_size = ds.size
# read header
i, base_size = msb_size(db)
i, target_size = msb_size(db, i)
# interpret opcodes
tbw = 0 # amount of target bytes written
while i < delta_buf_size:
c = ord(db[i])
i += 1
if c & 0x80:
cp_off, cp_size = 0, 0
if (c & 0x01):
cp_off = ord(db[i])
i += 1
if (c & 0x02):
cp_off |= (ord(db[i]) << 8)
i += 1
if (c & 0x04):
cp_off |= (ord(db[i]) << 16)
i += 1
if (c & 0x08):
cp_off |= (ord(db[i]) << 24)
i += 1
if (c & 0x10):
cp_size = ord(db[i])
i += 1
if (c & 0x20):
cp_size |= (ord(db[i]) << 8)
i += 1
if (c & 0x40):
cp_size |= (ord(db[i]) << 16)
i += 1
if not cp_size:
cp_size = 0x10000
rbound = cp_off + cp_size
if (rbound < cp_size or
rbound > base_size):
break
dcl.append(DeltaChunk(tbw, cp_size, cp_off, None))
tbw += cp_size
elif c:
# NOTE: in C, the data chunks should probably be concatenated here.
# In python, we do it as a post-process
dcl.append(DeltaChunk(tbw, c, 0, db[i:i + c]))
i += c
tbw += c
else:
raise ValueError("unexpected delta opcode 0")
# END handle command byte
# END while processing delta data
dcl.compress()
# merge the lists !
if dsi > 0:
if not tdcl.connect_with_next_base(dcl):
break
# END handle merge
# prepare next base
dcl = DeltaChunkList()
# END for each delta stream
return tdcl
def apply_delta_data(src_buf, src_buf_size, delta_buf, delta_buf_size, write):
"""
Apply data from a delta buffer using a source buffer to the target file
:param src_buf: random access data from which the delta was created
:param src_buf_size: size of the source buffer in bytes
:param delta_buf_size: size of the delta buffer in bytes
:param delta_buf: random access delta data
:param write: write method taking a chunk of bytes
**Note:** transcribed to python from the similar routine in patch-delta.c"""
i = 0
db = delta_buf
while i < delta_buf_size:
c = db[i]
i += 1
if c & 0x80:
cp_off, cp_size = 0, 0
if (c & 0x01):
cp_off = db[i]
i += 1
if (c & 0x02):
cp_off |= (db[i] << 8)
i += 1
if (c & 0x04):
cp_off |= (db[i] << 16)
i += 1
if (c & 0x08):
cp_off |= (db[i] << 24)
i += 1
if (c & 0x10):
cp_size = db[i]
i += 1
if (c & 0x20):
cp_size |= (db[i] << 8)
i += 1
if (c & 0x40):
cp_size |= (db[i] << 16)
i += 1
if not cp_size:
cp_size = 0x10000
rbound = cp_off + cp_size
if (rbound < cp_size or
rbound > src_buf_size):
break
write(src_buf[cp_off:cp_off + cp_size])
elif c:
write(db[i:i + c])
i += c
else:
raise ValueError("unexpected delta opcode 0")
# END handle command byte
# END while processing delta data
# yes, lets use the exact same error message that git uses :)
assert i == delta_buf_size, "delta replay has gone wild"
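# Hedged sketch of the opcode stream handled above: command 0x91 copies (offset byte 0x00,
# size byte 0x05) five bytes from the source buffer, then command 0x01 inserts one literal byte.
def _example_apply_delta_data():
    chunks = []
    delta = bytes([0x91, 0x00, 0x05, 0x01]) + b'!'
    apply_delta_data(b'hello world', 11, delta, len(delta), chunks.append)
    assert b''.join(chunks) == b'hello!'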
def is_equal_canonical_sha(canonical_length, match, sha1):
"""
:return: True if the given partial binary sha matches the given 20 byte binary sha
The comparison takes the canonical (hex) length of the match sha into account,
hence for uneven canonical representations only the high 4 bits of the last byte are compared
:param match: less than 20 byte sha, padded to full bytes
:param sha1: 20 byte sha"""
binary_length = canonical_length // 2
if match[:binary_length] != sha1[:binary_length]:
return False
if canonical_length - binary_length and \
(byte_ord(match[-1]) ^ byte_ord(sha1[len(match) - 1])) & 0xf0:
return False
# END handle uneven canonnical length
return True
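# Hedged sketch: a 3-hex-digit prefix 'abc' becomes the padded bytes 0xab 0xc0; only the
# high nibble of the padded final byte takes part in the comparison.
def _example_is_equal_canonical_sha():
    sha1 = bytes([0xab, 0xcd]) + b'\x00' * 18
    assert is_equal_canonical_sha(3, bytes([0xab, 0xc0]), sha1)
    assert not is_equal_canonical_sha(3, bytes([0xab, 0xd0]), sha1)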
#} END routines
try:
from gitdb_speedups._perf import connect_deltas
except ImportError:
pass

View File

@ -1,731 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
from io import BytesIO
import mmap
import os
import sys
import zlib
from gitdb.fun import (
msb_size,
stream_copy,
apply_delta_data,
connect_deltas,
delta_types
)
from gitdb.util import (
allocate_memory,
LazyMixin,
make_sha,
write,
close,
)
from gitdb.const import NULL_BYTE, BYTE_SPACE
from gitdb.utils.encoding import force_bytes
has_perf_mod = False
PY26 = sys.version_info[:2] < (2, 7)
try:
from gitdb_speedups._perf import apply_delta as c_apply_delta
has_perf_mod = True
except ImportError:
pass
__all__ = ('DecompressMemMapReader', 'FDCompressedSha1Writer', 'DeltaApplyReader',
'Sha1Writer', 'FlexibleSha1Writer', 'ZippedStoreShaWriter', 'FDCompressedSha1Writer',
'FDStream', 'NullStream')
#{ RO Streams
class DecompressMemMapReader(LazyMixin):
"""Reads data in chunks from a memory map and decompresses it. The client sees
only the uncompressed data; the corresponding file-like read calls handle on-demand
buffered decompression accordingly
A constraint on the total size of bytes is activated, simulating
a logical file within a possibly larger physical memory area
To read efficiently, you clearly don't want to read individual bytes, instead,
read a few kilobytes at least.
**Note:** The chunk-size should be carefully selected as it will involve quite a bit
of string copying due to the way zlib is implemented. It's very wasteful,
hence we try to find a good tradeoff between allocation time and number of
times we actually allocate. A custom zlib implementation would be good here
to better support streamed reading - it would only need to keep the mmap
and decompress it into chunks, that's all ... """
__slots__ = ('_m', '_zip', '_buf', '_buflen', '_br', '_cws', '_cwe', '_s', '_close',
'_cbr', '_phi')
max_read_size = 512 * 1024 # currently unused
def __init__(self, m, close_on_deletion, size=None):
"""Initialize with mmap for stream reading
:param m: must be content data - use new if you have object data and no size"""
self._m = m
self._zip = zlib.decompressobj()
self._buf = None # buffer of decompressed bytes
self._buflen = 0 # length of bytes in buffer
if size is not None:
self._s = size # size of uncompressed data to read in total
self._br = 0 # num uncompressed bytes read
self._cws = 0 # start byte of compression window
self._cwe = 0 # end byte of compression window
self._cbr = 0 # number of compressed bytes read
self._phi = False # is True if we parsed the header info
self._close = close_on_deletion # close the memmap on deletion ?
def _set_cache_(self, attr):
assert attr == '_s'
# only happens for size, which is a marker to indicate we still
# have to parse the header from the stream
self._parse_header_info()
def __del__(self):
self.close()
def _parse_header_info(self):
"""If this stream contains object data, parse the header info and skip the
stream to a point where each read will yield object content
:return: parsed type_string, size"""
# read header
# should really be enough, cgit uses 8192 I believe
# And for good reason !! This needs to be that high for the header to be read correctly in all cases
maxb = 8192
self._s = maxb
hdr = self.read(maxb)
hdrend = hdr.find(NULL_BYTE)
typ, size = hdr[:hdrend].split(BYTE_SPACE)
size = int(size)
self._s = size
# adjust internal state to match actual header length that we ignore
# The buffer will be depleted first on future reads
self._br = 0
hdrend += 1
self._buf = BytesIO(hdr[hdrend:])
self._buflen = len(hdr) - hdrend
self._phi = True
return typ, size
#{ Interface
@classmethod
def new(cls, m, close_on_deletion=False):
"""Create a new DecompressMemMapReader instance for acting as a read-only stream
This method parses the object header from m and returns the parsed
type and size, as well as the created stream instance.
:param m: memory map on which to operate. It must be object data ( header + contents )
:param close_on_deletion: if True, the memory map will be closed once we are
being deleted"""
inst = cls(m, close_on_deletion, 0)
typ, size = inst._parse_header_info()
return typ, size, inst
def data(self):
""":return: random access compatible data we are working on"""
return self._m
def close(self):
"""Close our underlying stream of compressed bytes if this was allowed during initialization
:return: True if we closed the underlying stream
:note: can be called safely
"""
if self._close:
if hasattr(self._m, 'close'):
self._m.close()
self._close = False
# END handle resource freeing
def compressed_bytes_read(self):
"""
:return: number of compressed bytes read. This includes the bytes it
took to decompress the header ( if there was one )"""
# ABSTRACT: When decompressing a byte stream, it can be that the first
# x bytes which were requested match the first x bytes in the loosely
# compressed datastream. This is the worst-case assumption that the reader
# does, it assumes that it will get at least X bytes from X compressed bytes
# in all cases.
# The caveat is that the object, according to our known uncompressed size,
# is already complete, but there are still some bytes left in the compressed
# stream that contribute to the amount of compressed bytes.
# How can we know that we are truly done, and have read all bytes we need
# to read ?
# Without help, we cannot know, as we need to obtain the status of the
# decompression. If it is not finished, we need to decompress more data
# until it is finished, to yield the actual number of compressed bytes
# belonging to the decompressed object
# We are using a custom zlib module for this; if it's not present,
# we try to put in additional bytes up for decompression if feasible
# and check for the unused_data.
# Only scrub the stream forward if we are officially done with the
# bytes we were to have.
if self._br == self._s and not self._zip.unused_data:
# manipulate the bytes-read to allow our own read method to continue
# but keep the window at its current position
self._br = 0
if hasattr(self._zip, 'status'):
while self._zip.status == zlib.Z_OK:
self.read(mmap.PAGESIZE)
# END scrub-loop custom zlib
else:
# pass in additional pages, until we have unused data
while not self._zip.unused_data and self._cbr != len(self._m):
self.read(mmap.PAGESIZE)
# END scrub-loop default zlib
# END handle stream scrubbing
# reset bytes read, just to be sure
self._br = self._s
# END handle stream scrubbing
# unused data ends up in the unconsumed tail, which was removed
# from the count already
return self._cbr
#} END interface
def seek(self, offset, whence=getattr(os, 'SEEK_SET', 0)):
"""Allows to reset the stream to restart reading
:raise ValueError: If offset and whence are not 0"""
if offset != 0 or whence != getattr(os, 'SEEK_SET', 0):
raise ValueError("Can only seek to position 0")
# END handle offset
self._zip = zlib.decompressobj()
self._br = self._cws = self._cwe = self._cbr = 0
if self._phi:
self._phi = False
del(self._s) # trigger header parsing on first access
# END skip header
def read(self, size=-1):
if size < 1:
size = self._s - self._br
else:
size = min(size, self._s - self._br)
# END clamp size
if size == 0:
return bytes()
# END handle depletion
# deplete the buffer, then just continue using the decompress object
# which has an own buffer. We just need this to transparently parse the
# header from the zlib stream
dat = bytes()
if self._buf:
if self._buflen >= size:
# have enough data
dat = self._buf.read(size)
self._buflen -= size
self._br += size
return dat
else:
dat = self._buf.read() # ouch, duplicates data
size -= self._buflen
self._br += self._buflen
self._buflen = 0
self._buf = None
# END handle buffer len
# END handle buffer
# decompress some data
# Abstract: zlib needs to operate on chunks of our memory map ( which may
# be large ), as it will otherwise and always fill in the 'unconsumed_tail'
# attribute, which possibly reads our whole map to the end, forcing
# everything to be read from disk even though just a portion was requested.
# As this would be a no-go, we work around it by passing only chunks of data,
# moving the window into the memory map along as we decompress, which keeps
# the tail smaller than our chunk-size. This causes 'only' the chunk to be
# copied once, and another copy of a part of it when it creates the unconsumed
# tail. We have to use it to hand in the appropriate amount of bytes during
# the next read.
tail = self._zip.unconsumed_tail
if tail:
# move the window, make it as large as size demands. For code-clarity,
# we just take the chunk from our map again instead of reusing the unconsumed
# tail. The latter one would save some memory copying, but we could end up
# with not getting enough data uncompressed, so we had to sort that out as well.
# Now we just assume the worst case, hence the data is uncompressed and the window
# needs to be as large as the uncompressed bytes we want to read.
self._cws = self._cwe - len(tail)
self._cwe = self._cws + size
else:
cws = self._cws
self._cws = self._cwe
self._cwe = cws + size
# END handle tail
# if window is too small, make it larger so zip can decompress something
if self._cwe - self._cws < 8:
self._cwe = self._cws + 8
# END adjust winsize
# takes a slice, but doesn't copy the data, it says ...
indata = self._m[self._cws:self._cwe]
# get the actual window end to be sure we don't use it for computations
self._cwe = self._cws + len(indata)
dcompdat = self._zip.decompress(indata, size)
# update the amount of compressed bytes read
# We feed possibly overlapping chunks, which is why the unconsumed tail
# has to be taken into consideration, as well as the unused data
# if we hit the end of the stream
# NOTE: Behavior changed in PY2.7 onward, which requires special handling to make the tests work properly.
# They are thorough, and I assume it is truly working.
# Why is this logic as convoluted as it is ? Please look at the table in
# https://github.com/gitpython-developers/gitdb/issues/19 to learn about the test-results.
# Basically, on py2.6, you want to use branch 1, whereas on all other python versions, the second branch
# will be the one that works.
# However, the zlib VERSIONs as well as the platform check is used to further match the entries in the
# table in the github issue. This is it ... it was the only way I could make this work everywhere.
# IT's CERTAINLY GOING TO BITE US IN THE FUTURE ... .
if PY26 or ((zlib.ZLIB_VERSION == '1.2.7' or zlib.ZLIB_VERSION == '1.2.5') and not sys.platform == 'darwin'):
unused_datalen = len(self._zip.unconsumed_tail)
else:
unused_datalen = len(self._zip.unconsumed_tail) + len(self._zip.unused_data)
# # end handle very special case ...
self._cbr += len(indata) - unused_datalen
self._br += len(dcompdat)
if dat:
dcompdat = dat + dcompdat
# END prepend our cached data
# it can happen, depending on the compression, that we get less bytes
# than ordered as it needs the final portion of the data as well.
# Recursively resolve that.
# Note: dcompdat can be empty even though we still appear to have bytes
# to read, if we are called by compressed_bytes_read - it manipulates
# us to empty the stream
if dcompdat and (len(dcompdat) - len(dat)) < size and self._br < self._s:
dcompdat += self.read(size - len(dcompdat))
# END handle special case
return dcompdat
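# Hedged usage sketch, mirroring what the test-suite does: hand zlib-compressed loose-object
# data ('<type> <size>\0<content>') to new(), which parses the header and then yields only
# the decompressed content on read().
def _example_decompress_reader():
    zdata = zlib.compress(b'blob 11\x00hello world')
    typ, size, reader = DecompressMemMapReader.new(zdata)
    assert (typ, size) == (b'blob', 11)
    assert reader.read() == b'hello world'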
class DeltaApplyReader(LazyMixin):
"""A reader which dynamically applies pack deltas to a base object, keeping the
memory demands to a minimum.
The size of the final object is only obtainable once all deltas have been
applied, unless it is retrieved from a pack index.
The uncompressed delta has the following layout (sizes are encoded as MSB-continued
variable-length integers):
* MSB Source Size - the size of the base against which the delta was created
* MSB Target Size - the size of the resulting data after the delta was applied
* A list of one byte commands (cmd) which are followed by a specific protocol:
* cmd & 0x80 - copy source_data[offset:offset+size]
* Followed by an encoded offset into the source (base) data
* Followed by an encoded size of the chunk to copy
* cmd & 0x7f - insert
* insert cmd bytes from the delta buffer into the output stream
* cmd == 0 - invalid operation ( or error in delta stream )
"""
__slots__ = (
"_bstream", # base stream to which to apply the deltas
"_dstreams", # tuple of delta stream readers
"_mm_target", # memory map of the delta-applied data
"_size", # actual number of bytes in _mm_target
"_br" # number of bytes read
)
#{ Configuration
k_max_memory_move = 250 * 1000 * 1000
#} END configuration
def __init__(self, stream_list):
"""Initialize this instance with a list of streams, the first stream being
the delta to apply on top of all following deltas, the last stream being the
base object onto which to apply the deltas"""
assert len(stream_list) > 1, "Need at least one delta and one base stream"
self._bstream = stream_list[-1]
self._dstreams = tuple(stream_list[:-1])
self._br = 0
def _set_cache_too_slow_without_c(self, attr):
# the direct algorithm is fastest and most direct if there is only one
# delta. Also, the extra overhead might not be worth it for items smaller
# than X - definitely the case in python, every function call costs
# huge amounts of time
# if len(self._dstreams) * self._bstream.size < self.k_max_memory_move:
if len(self._dstreams) == 1:
return self._set_cache_brute_(attr)
# Aggregate all deltas into one delta in reverse order. Hence we take
# the last delta, and reverse-merge its ancestor delta, until we receive
# the final delta data stream.
dcl = connect_deltas(self._dstreams)
# call len directly, as the (optional) c version doesn't implement the sequence
# protocol
if dcl.rbound() == 0:
self._size = 0
self._mm_target = allocate_memory(0)
return
# END handle empty list
self._size = dcl.rbound()
self._mm_target = allocate_memory(self._size)
bbuf = allocate_memory(self._bstream.size)
stream_copy(self._bstream.read, bbuf.write, self._bstream.size, 256 * mmap.PAGESIZE)
# APPLY CHUNKS
write = self._mm_target.write
dcl.apply(bbuf, write)
self._mm_target.seek(0)
def _set_cache_brute_(self, attr):
"""If we are here, we apply the actual deltas"""
# TODO: There should be a special case if there is only one stream
# Then the default-git algorithm should perform a tad faster, as the
# delta is not peeked into, causing less overhead.
buffer_info_list = list()
max_target_size = 0
for dstream in self._dstreams:
buf = dstream.read(512) # read the header information + X
offset, src_size = msb_size(buf)
offset, target_size = msb_size(buf, offset)
buffer_info_list.append((buf[offset:], offset, src_size, target_size))
max_target_size = max(max_target_size, target_size)
# END for each delta stream
# sanity check - the first delta to apply should have the same source
# size as our actual base stream
base_size = self._bstream.size
target_size = max_target_size
# if we have more than 1 delta to apply, we will swap buffers, hence we must
# assure that all buffers we use are large enough to hold all the results
if len(self._dstreams) > 1:
base_size = target_size = max(base_size, max_target_size)
# END adjust buffer sizes
# Allocate private memory map big enough to hold the first base buffer
# We need random access to it
bbuf = allocate_memory(base_size)
stream_copy(self._bstream.read, bbuf.write, base_size, 256 * mmap.PAGESIZE)
# allocate memory map large enough for the largest (intermediate) target
# We will use it as scratch space for all delta ops. If the final
# target buffer is smaller than our allocated space, we just use parts
# of it upon return.
tbuf = allocate_memory(target_size)
# for each delta to apply, memory map the decompressed delta and
# work on the op-codes to reconstruct everything.
# For the actual copying, we use a seek and write pattern of buffer
# slices.
final_target_size = None
for (dbuf, offset, src_size, target_size), dstream in zip(reversed(buffer_info_list), reversed(self._dstreams)):
# allocate a buffer to hold all delta data - fill in the data for
# fast access. We do this as we know that reading individual bytes
# from our stream would be slower than necessary ( although possible )
# The dbuf buffer contains commands after the first two MSB sizes, the
# offset specifies the amount of bytes read to get the sizes.
ddata = allocate_memory(dstream.size - offset)
ddata.write(dbuf)
# read the rest from the stream. The size we give is larger than necessary
stream_copy(dstream.read, ddata.write, dstream.size, 256 * mmap.PAGESIZE)
#######################################################################
if 'c_apply_delta' in globals():
c_apply_delta(bbuf, ddata, tbuf)
else:
apply_delta_data(bbuf, src_size, ddata, len(ddata), tbuf.write)
#######################################################################
# finally, swap out source and target buffers. The target is now the
# base for the next delta to apply
bbuf, tbuf = tbuf, bbuf
bbuf.seek(0)
tbuf.seek(0)
final_target_size = target_size
# END for each delta to apply
# its already seeked to 0, constrain it to the actual size
# NOTE: in the end of the loop, it swaps buffers, hence our target buffer
# is not tbuf, but bbuf !
self._mm_target = bbuf
self._size = final_target_size
#{ Configuration
if not has_perf_mod:
_set_cache_ = _set_cache_brute_
else:
_set_cache_ = _set_cache_too_slow_without_c
#} END configuration
def read(self, count=0):
bl = self._size - self._br # bytes left
if count < 1 or count > bl:
count = bl
# NOTE: we could check for certain size limits, and possibly
# return buffers instead of strings to prevent byte copying
data = self._mm_target.read(count)
self._br += len(data)
return data
def seek(self, offset, whence=getattr(os, 'SEEK_SET', 0)):
"""Allows to reset the stream to restart reading
:raise ValueError: If offset and whence are not 0"""
if offset != 0 or whence != getattr(os, 'SEEK_SET', 0):
raise ValueError("Can only seek to position 0")
# END handle offset
self._br = 0
self._mm_target.seek(0)
#{ Interface
@classmethod
def new(cls, stream_list):
"""
Convert the given list of streams into a stream which resolves deltas
when reading from it.
:param stream_list: two or more stream objects, first stream is a Delta
to the object that you want to resolve, followed by N additional delta
streams. The list's last stream must be a non-delta stream.
:return: Non-Delta OPackStream object whose stream can be used to obtain
the decompressed resolved data
:raise ValueError: if the stream list cannot be handled"""
if len(stream_list) < 2:
raise ValueError("Need at least two streams")
# END single object special handling
if stream_list[-1].type_id in delta_types:
raise ValueError(
"Cannot resolve deltas if there is no base object stream, last one was type: %s" % stream_list[-1].type)
# END check stream
return cls(stream_list)
#} END interface
#{ OInfo like Interface
@property
def type(self):
return self._bstream.type
@property
def type_id(self):
return self._bstream.type_id
@property
def size(self):
""":return: number of uncompressed bytes in the stream"""
return self._size
#} END oinfo like interface
#} END RO streams
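# Hedged sketch (assumes a PackFile instance as used in the test-suite): collect the delta
# chain at a pack offset and resolve it into the final object's bytes. new() raises
# ValueError if the collected chain cannot be handled (for instance, fewer than two streams).
def _example_resolve_delta_chain(pack, pack_offset):
    streams = pack.collect_streams(pack_offset)
    reader = DeltaApplyReader.new(streams)
    return reader.type, reader.size, reader.read()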
#{ W Streams
class Sha1Writer(object):
"""Simple stream writer which produces a sha whenever you like as it degests
everything it is supposed to write"""
__slots__ = "sha1"
def __init__(self):
self.sha1 = make_sha()
#{ Stream Interface
def write(self, data):
""":raise IOError: If not all bytes could be written
:param data: byte object
:return: length of incoming data"""
self.sha1.update(data)
return len(data)
# END stream interface
#{ Interface
def sha(self, as_hex=False):
""":return: sha so far
:param as_hex: if True, sha will be hex-encoded, binary otherwise"""
if as_hex:
return self.sha1.hexdigest()
return self.sha1.digest()
#} END interface
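# Hedged sketch: the writer only digests what it receives; it stores no data itself.
def _example_sha1_writer():
    writer = Sha1Writer()
    assert writer.write(b'hi') == 2
    assert len(writer.sha(as_hex=True)) == 40
    assert len(writer.sha()) == 20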
class FlexibleSha1Writer(Sha1Writer):
"""Writer producing a sha1 while passing on the written bytes to the given
write function"""
__slots__ = 'writer'
def __init__(self, writer):
Sha1Writer.__init__(self)
self.writer = writer
def write(self, data):
Sha1Writer.write(self, data)
self.writer(data)
class ZippedStoreShaWriter(Sha1Writer):
"""Remembers everything someone writes to it and generates a sha"""
__slots__ = ('buf', 'zip')
def __init__(self):
Sha1Writer.__init__(self)
self.buf = BytesIO()
self.zip = zlib.compressobj(zlib.Z_BEST_SPEED)
def __getattr__(self, attr):
return getattr(self.buf, attr)
def write(self, data):
alen = Sha1Writer.write(self, data)
self.buf.write(self.zip.compress(data))
return alen
def close(self):
self.buf.write(self.zip.flush())
def seek(self, offset, whence=getattr(os, 'SEEK_SET', 0)):
"""Seeking currently only supports to rewind written data
Multiple writes are not supported"""
if offset != 0 or whence != getattr(os, 'SEEK_SET', 0):
raise ValueError("Can only seek to position 0")
# END handle offset
self.buf.seek(0)
def getvalue(self):
""":return: string value from the current stream position to the end"""
return self.buf.getvalue()
class FDCompressedSha1Writer(Sha1Writer):
"""Digests data written to it, making the sha available, then compress the
data and write it to the file descriptor
**Note:** operates on raw file descriptors
**Note:** for this to work, you have to use the close-method of this instance"""
__slots__ = ("fd", "sha1", "zip")
# default exception
exc = IOError("Failed to write all bytes to filedescriptor")
def __init__(self, fd):
super(FDCompressedSha1Writer, self).__init__()
self.fd = fd
self.zip = zlib.compressobj(zlib.Z_BEST_SPEED)
#{ Stream Interface
def write(self, data):
""":raise IOError: If not all bytes could be written
:return: length of incoming data"""
self.sha1.update(data)
cdata = self.zip.compress(data)
bytes_written = write(self.fd, cdata)
if bytes_written != len(cdata):
raise self.exc
return len(data)
def close(self):
remainder = self.zip.flush()
if write(self.fd, remainder) != len(remainder):
raise self.exc
return close(self.fd)
#} END stream interface
class FDStream(object):
"""A simple wrapper providing the most basic functions on a file descriptor
with the fileobject interface. Cannot use os.fdopen as the resulting stream
takes ownership"""
__slots__ = ("_fd", '_pos')
def __init__(self, fd):
self._fd = fd
self._pos = 0
def write(self, data):
self._pos += len(data)
os.write(self._fd, data)
def read(self, count=0):
if count == 0:
    # read to the end of the file; os.read returns at most what is available
    count = os.fstat(self._fd).st_size
# END handle read everything
bytes = os.read(self._fd, count)
self._pos += len(bytes)
return bytes
def fileno(self):
return self._fd
def tell(self):
return self._pos
def close(self):
close(self._fd)
class NullStream(object):
"""A stream that does nothing but providing a stream interface.
Use it like /dev/null"""
__slots__ = tuple()
def read(self, size=0):
return ''
def close(self):
pass
def write(self, data):
return len(data)
#} END W streams

View File

@ -1,4 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php

View File

@ -1,207 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Utilities used in ODB testing"""
from gitdb import OStream
import sys
import random
from array import array
from io import BytesIO
import glob
import unittest
import tempfile
import shutil
import os
import gc
import logging
from functools import wraps
#{ Bases
class TestBase(unittest.TestCase):
"""Base class for all tests
TestCase providing access to readonly repositories using the following member variables.
* gitrepopath
* read-only base path of the git source repository, i.e. .../git/.git
"""
#{ Invariants
k_env_git_repo = "GITDB_TEST_GIT_REPO_BASE"
#} END invariants
@classmethod
def setUpClass(cls):
try:
super(TestBase, cls).setUpClass()
except AttributeError:
pass
cls.gitrepopath = os.environ.get(cls.k_env_git_repo)
if not cls.gitrepopath:
logging.info(
"You can set the %s environment variable to a .git repository of your choice - defaulting to the gitdb repository", cls.k_env_git_repo)
ospd = os.path.dirname
cls.gitrepopath = os.path.join(ospd(ospd(ospd(__file__))), '.git')
# end assure gitrepo is set
assert cls.gitrepopath.endswith('.git')
#} END bases
#{ Decorators
def skip_on_travis_ci(func):
"""All tests decorated with this one will raise SkipTest when run on travis ci.
Use it to work around difficult-to-solve issues
NOTE: copied from bcore (https://github.com/Byron/bcore)"""
@wraps(func)
def wrapper(self, *args, **kwargs):
if 'TRAVIS' in os.environ:
import nose
raise nose.SkipTest("Cannot run on travis-ci")
# end check for travis ci
return func(self, *args, **kwargs)
# end wrapper
return wrapper
def with_rw_directory(func):
"""Create a temporary directory which can be written to, remove it if the
test succeeds, but leave it otherwise to aid additional debugging"""
def wrapper(self):
path = tempfile.mktemp(prefix=func.__name__)
os.mkdir(path)
keep = False
try:
try:
return func(self, path)
except Exception:
sys.stderr.write("Test {}.{} failed, output is at {!r}\n".format(type(self).__name__, func.__name__, path))
keep = True
raise
finally:
# Need to collect here to be sure all handles have been closed. It appears
# a windows-only issue. In fact things should be deleted, as well as
# memory maps closed, once objects go out of scope. For some reason
# though this is not the case here unless we collect explicitly.
if not keep:
gc.collect()
shutil.rmtree(path)
# END handle exception
# END wrapper
wrapper.__name__ = func.__name__
return wrapper
def with_packs_rw(func):
"""Function that provides a path into which the packs for testing should be
copied. Will pass on the path to the actual function afterwards"""
def wrapper(self, path):
src_pack_glob = fixture_path('packs/*')
copy_files_globbed(src_pack_glob, path, hard_link_ok=True)
return func(self, path)
# END wrapper
wrapper.__name__ = func.__name__
return wrapper
#} END decorators
#{ Routines
def fixture_path(relapath=''):
""":return: absolute path into the fixture directory
:param relapath: relative path into the fixtures directory, or ''
to obtain the fixture directory itself"""
return os.path.join(os.path.dirname(__file__), 'fixtures', relapath)
def copy_files_globbed(source_glob, target_dir, hard_link_ok=False):
"""Copy all files found according to the given source glob into the target directory
:param hard_link_ok: if True, hard links will be created if possible. Otherwise
the files will be copied"""
for src_file in glob.glob(source_glob):
if hard_link_ok and hasattr(os, 'link'):
target = os.path.join(target_dir, os.path.basename(src_file))
try:
os.link(src_file, target)
except OSError:
shutil.copy(src_file, target_dir)
# END handle cross device links ( and resulting failure )
else:
shutil.copy(src_file, target_dir)
# END try hard link
# END for each file to copy
def make_bytes(size_in_bytes, randomize=False):
""":return: string with given size in bytes
:param randomize: try to produce a very random stream"""
actual_size = size_in_bytes // 4
producer = range(actual_size)
if randomize:
producer = list(producer)
random.shuffle(producer)
# END randomize
a = array('i', producer)
return a.tobytes()
def make_object(type, data):
    """:return: bytes resembling an uncompressed object of the given type"""
    if isinstance(type, bytes):
        type = type.decode("ascii")
    odata = "%s %i\0" % (type, len(data))
    return odata.encode("ascii") + data
def make_memory_file(size_in_bytes, randomize=False):
""":return: tuple(size_of_stream, stream)
:param randomize: try to produce a very random stream"""
d = make_bytes(size_in_bytes, randomize)
return len(d), BytesIO(d)
#} END routines
#{ Stream Utilities
class DummyStream(object):
def __init__(self):
self.was_read = False
self.bytes = 0
self.closed = False
def read(self, size):
self.was_read = True
self.bytes = size
def close(self):
self.closed = True
def _assert(self):
assert self.was_read
class DeriveTest(OStream):
def __init__(self, sha, type, size, stream, *args, **kwargs):
self.myarg = kwargs.pop('myarg')
self.args = args
def _assert(self):
assert self.args
assert self.myarg
#} END stream utilities

View File

@ -1,105 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Test for object db"""
from gitdb.test.lib import (
TestBase,
DummyStream,
DeriveTest,
)
from gitdb import (
OInfo,
OPackInfo,
ODeltaPackInfo,
OStream,
OPackStream,
ODeltaPackStream,
IStream
)
from gitdb.util import (
NULL_BIN_SHA
)
from gitdb.typ import (
str_blob_type
)
class TestBaseTypes(TestBase):
def test_streams(self):
# test info
sha = NULL_BIN_SHA
s = 20
blob_id = 3
info = OInfo(sha, str_blob_type, s)
assert info.binsha == sha
assert info.type == str_blob_type
assert info.type_id == blob_id
assert info.size == s
# test pack info
# provides type_id
pinfo = OPackInfo(0, blob_id, s)
assert pinfo.type == str_blob_type
assert pinfo.type_id == blob_id
assert pinfo.pack_offset == 0
dpinfo = ODeltaPackInfo(0, blob_id, s, sha)
assert dpinfo.type == str_blob_type
assert dpinfo.type_id == blob_id
assert dpinfo.delta_info == sha
assert dpinfo.pack_offset == 0
# test ostream
stream = DummyStream()
ostream = OStream(*(info + (stream, )))
assert ostream.stream is stream
ostream.read(15)
stream._assert()
assert stream.bytes == 15
ostream.read(20)
assert stream.bytes == 20
# test packstream
postream = OPackStream(*(pinfo + (stream, )))
assert postream.stream is stream
postream.read(10)
stream._assert()
assert stream.bytes == 10
# test deltapackstream
dpostream = ODeltaPackStream(*(dpinfo + (stream, )))
assert dpostream.stream is stream
dpostream.read(5)
stream._assert()
assert stream.bytes == 5
# derive with own args
DeriveTest(sha, str_blob_type, s, stream, 'mine', myarg=3)._assert()
# test istream
istream = IStream(str_blob_type, s, stream)
assert istream.binsha is None
istream.binsha = sha
assert istream.binsha == sha
assert len(istream.binsha) == 20
assert len(istream.hexsha) == 40
assert istream.size == s
istream.size = s * 2
assert istream.size == s * 2
assert istream.type == str_blob_type
istream.type = "something"
assert istream.type == "something"
assert istream.stream is stream
istream.stream = None
assert istream.stream is None
assert istream.error is None
istream.error = Exception()
assert isinstance(istream.error, Exception)

View File

@ -1,43 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module with examples from the tutorial section of the docs"""
import os
from gitdb.test.lib import TestBase
from gitdb import IStream
from gitdb.db import LooseObjectDB
from io import BytesIO
class TestExamples(TestBase):
def test_base(self):
ldb = LooseObjectDB(os.path.join(self.gitrepopath, 'objects'))
for sha1 in ldb.sha_iter():
oinfo = ldb.info(sha1)
ostream = ldb.stream(sha1)
assert oinfo[:3] == ostream[:3]
assert len(ostream.read()) == ostream.size
assert ldb.has_object(oinfo.binsha)
# END for each sha in database
# assure we close all files
try:
del(ostream)
del(oinfo)
except UnboundLocalError:
pass
# END ignore exception if there are no loose objects
data = "my data".encode("ascii")
istream = IStream("blob", len(data), BytesIO(data))
# the object does not yet have a sha
assert istream.binsha is None
ldb.store(istream)
# now the sha is set
assert len(istream.binsha) == 20
assert ldb.has_object(istream.binsha)

View File

@ -1,249 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Test everything about packs reading and writing"""
from gitdb.test.lib import (
TestBase,
with_rw_directory,
fixture_path
)
from gitdb.stream import DeltaApplyReader
from gitdb.pack import (
PackEntity,
PackIndexFile,
PackFile
)
from gitdb.base import (
OInfo,
OStream,
)
from gitdb.fun import delta_types
from gitdb.exc import UnsupportedOperation
from gitdb.util import to_bin_sha
from nose import SkipTest
import os
import tempfile
#{ Utilities
def bin_sha_from_filename(filename):
return to_bin_sha(os.path.splitext(os.path.basename(filename))[0][5:])
#} END utilities
class TestPack(TestBase):
packindexfile_v1 = (fixture_path('packs/pack-c0438c19fb16422b6bbcce24387b3264416d485b.idx'), 1, 67)
packindexfile_v2 = (fixture_path('packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.idx'), 2, 30)
packindexfile_v2_3_ascii = (fixture_path('packs/pack-a2bf8e71d8c18879e499335762dd95119d93d9f1.idx'), 2, 42)
packfile_v2_1 = (fixture_path('packs/pack-c0438c19fb16422b6bbcce24387b3264416d485b.pack'), 2, packindexfile_v1[2])
packfile_v2_2 = (fixture_path('packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.pack'), 2, packindexfile_v2[2])
packfile_v2_3_ascii = (
fixture_path('packs/pack-a2bf8e71d8c18879e499335762dd95119d93d9f1.pack'), 2, packindexfile_v2_3_ascii[2])
def _assert_index_file(self, index, version, size):
assert index.packfile_checksum() != index.indexfile_checksum()
assert len(index.packfile_checksum()) == 20
assert len(index.indexfile_checksum()) == 20
assert index.version() == version
assert index.size() == size
assert len(index.offsets()) == size
# get all data of all objects
for oidx in range(index.size()):
sha = index.sha(oidx)
assert oidx == index.sha_to_index(sha)
entry = index.entry(oidx)
assert len(entry) == 3
assert entry[0] == index.offset(oidx)
assert entry[1] == sha
assert entry[2] == index.crc(oidx)
# verify partial sha
for l in (4, 8, 11, 17, 20):
assert index.partial_sha_to_index(sha[:l], l * 2) == oidx
# END for each object index in indexfile
self.failUnlessRaises(ValueError, index.partial_sha_to_index, "\0", 2)
def _assert_pack_file(self, pack, version, size):
assert pack.version() == 2
assert pack.size() == size
assert len(pack.checksum()) == 20
num_obj = 0
for obj in pack.stream_iter():
num_obj += 1
info = pack.info(obj.pack_offset)
stream = pack.stream(obj.pack_offset)
assert info.pack_offset == stream.pack_offset
assert info.type_id == stream.type_id
assert hasattr(stream, 'read')
# it should be possible to read from both streams
assert obj.read() == stream.read()
streams = pack.collect_streams(obj.pack_offset)
assert streams
# read the stream
try:
dstream = DeltaApplyReader.new(streams)
except ValueError:
# ignore these, old git versions use only ref deltas,
# which we haven't resolved ( as we are without an index )
# Also ignore non-delta streams
continue
# END get deltastream
# read all
data = dstream.read()
assert len(data) == dstream.size
# test seek
dstream.seek(0)
assert dstream.read() == data
# read chunks
# NOTE: the current implementation is safe, it basically transfers
# all calls to the underlying memory map
# END for each object
assert num_obj == size
def test_pack_index(self):
# check version 1 and 2
for indexfile, version, size in (self.packindexfile_v1, self.packindexfile_v2):
index = PackIndexFile(indexfile)
self._assert_index_file(index, version, size)
# END run tests
def test_pack(self):
# there is this special version 3, but apparently its like 2 ...
for packfile, version, size in (self.packfile_v2_3_ascii, self.packfile_v2_1, self.packfile_v2_2):
pack = PackFile(packfile)
self._assert_pack_file(pack, version, size)
# END for each pack to test
@with_rw_directory
def test_pack_entity(self, rw_dir):
pack_objs = list()
for packinfo, indexinfo in ((self.packfile_v2_1, self.packindexfile_v1),
(self.packfile_v2_2, self.packindexfile_v2),
(self.packfile_v2_3_ascii, self.packindexfile_v2_3_ascii)):
packfile, version, size = packinfo
indexfile, version, size = indexinfo
entity = PackEntity(packfile)
assert entity.pack().path() == packfile
assert entity.index().path() == indexfile
pack_objs.extend(entity.stream_iter())
count = 0
for info, stream in zip(entity.info_iter(), entity.stream_iter()):
count += 1
assert info.binsha == stream.binsha
assert len(info.binsha) == 20
assert info.type_id == stream.type_id
assert info.size == stream.size
# we return fully resolved items, which is implied by the sha centric access
assert info.type_id not in delta_types
# try all calls
assert len(entity.collect_streams(info.binsha))
oinfo = entity.info(info.binsha)
assert isinstance(oinfo, OInfo)
assert oinfo.binsha is not None
ostream = entity.stream(info.binsha)
assert isinstance(ostream, OStream)
assert ostream.binsha is not None
# verify the stream
try:
assert entity.is_valid_stream(info.binsha, use_crc=True)
except UnsupportedOperation:
pass
# END ignore version issues
assert entity.is_valid_stream(info.binsha, use_crc=False)
# END for each info, stream tuple
assert count == size
# END for each entity
# pack writing - write all packs into one
# index path can be None
pack_path1 = tempfile.mktemp('', "pack1", rw_dir)
pack_path2 = tempfile.mktemp('', "pack2", rw_dir)
index_path = tempfile.mktemp('', 'index', rw_dir)
iteration = 0
def rewind_streams():
for obj in pack_objs:
obj.stream.seek(0)
# END utility
for ppath, ipath, num_obj in zip((pack_path1, pack_path2),
(index_path, None),
(len(pack_objs), None)):
iwrite = None
if ipath:
ifile = open(ipath, 'wb')
iwrite = ifile.write
# END handle ip
# make sure we rewind the streams ... we work on the same objects over and over again
if iteration > 0:
rewind_streams()
# END rewind streams
iteration += 1
with open(ppath, 'wb') as pfile:
pack_sha, index_sha = PackEntity.write_pack(pack_objs, pfile.write, iwrite, object_count=num_obj)
assert os.path.getsize(ppath) > 100
# verify pack
pf = PackFile(ppath)
assert pf.size() == len(pack_objs)
assert pf.version() == PackFile.pack_version_default
assert pf.checksum() == pack_sha
pf.close()
# verify index
if ipath is not None:
ifile.close()
assert os.path.getsize(ipath) > 100
idx = PackIndexFile(ipath)
assert idx.version() == PackIndexFile.index_version_default
assert idx.packfile_checksum() == pack_sha
assert idx.indexfile_checksum() == index_sha
assert idx.size() == len(pack_objs)
idx.close()
# END verify files exist
# END for each packpath, indexpath pair
# verify the packs thoroughly
rewind_streams()
entity = PackEntity.create(pack_objs, rw_dir)
count = 0
for info in entity.info_iter():
count += 1
for use_crc in range(2):
assert entity.is_valid_stream(info.binsha, use_crc)
# END for each crc mode
# END for each info
assert count == len(pack_objs)
entity.close()
def test_pack_64(self):
# TODO: hex-edit a pack helping us to verify that we can handle 64 byte offsets
# of course without really needing such a huge pack
raise SkipTest()

View File

@ -1,164 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Test for object db"""
from gitdb.test.lib import (
TestBase,
DummyStream,
make_bytes,
make_object,
fixture_path
)
from gitdb import (
DecompressMemMapReader,
FDCompressedSha1Writer,
LooseObjectDB,
Sha1Writer,
MemoryDB,
IStream,
)
from gitdb.util import hex_to_bin
import zlib
from gitdb.typ import (
str_blob_type
)
import tempfile
import os
from io import BytesIO
class TestStream(TestBase):
"""Test stream classes"""
data_sizes = (15, 10000, 1000 * 1024 + 512)
def _assert_stream_reader(self, stream, cdata, rewind_stream=lambda s: None):
"""Make stream tests - the orig_stream is seekable, allowing it to be
rewound and reused
:param cdata: the data we expect to read from stream, the contents
:param rewind_stream: function called to rewind the stream to make it ready
for reuse"""
ns = 10
assert len(cdata) > ns - 1, "Data must be larger than %i, was %i" % (ns, len(cdata))
# read in small steps
ss = len(cdata) // ns
for i in range(ns):
data = stream.read(ss)
chunk = cdata[i * ss:(i + 1) * ss]
assert data == chunk
# END for each step
rest = stream.read()
if rest:
assert rest == cdata[-len(rest):]
# END handle rest
if isinstance(stream, DecompressMemMapReader):
assert len(stream.data()) == stream.compressed_bytes_read()
# END handle special type
rewind_stream(stream)
# read everything
rdata = stream.read()
assert rdata == cdata
if isinstance(stream, DecompressMemMapReader):
assert len(stream.data()) == stream.compressed_bytes_read()
# END handle special type
def test_decompress_reader(self):
for close_on_deletion in range(2):
for with_size in range(2):
for ds in self.data_sizes:
cdata = make_bytes(ds, randomize=False)
# zdata = zipped actual data
# cdata = original content data
# create reader
if with_size:
# need object data
zdata = zlib.compress(make_object(str_blob_type, cdata))
typ, size, reader = DecompressMemMapReader.new(zdata, close_on_deletion)
assert size == len(cdata)
assert typ == str_blob_type
# even if we don't set the size, it will be set automatically on first read
test_reader = DecompressMemMapReader(zdata, close_on_deletion=False)
assert test_reader._s == len(cdata)
else:
# here we need content data
zdata = zlib.compress(cdata)
reader = DecompressMemMapReader(zdata, close_on_deletion, len(cdata))
assert reader._s == len(cdata)
# END get reader
self._assert_stream_reader(reader, cdata, lambda r: r.seek(0))
# put in a dummy stream for closing
dummy = DummyStream()
reader._m = dummy
assert not dummy.closed
del(reader)
assert dummy.closed == close_on_deletion
# END for each datasize
# END whether size should be used
# END whether stream should be closed when deleted
def test_sha_writer(self):
writer = Sha1Writer()
assert 2 == writer.write("hi".encode("ascii"))
assert len(writer.sha(as_hex=1)) == 40
assert len(writer.sha(as_hex=0)) == 20
# make sure it does something ;)
prev_sha = writer.sha()
writer.write("hi again".encode("ascii"))
assert writer.sha() != prev_sha
def test_compressed_writer(self):
for ds in self.data_sizes:
fd, path = tempfile.mkstemp()
ostream = FDCompressedSha1Writer(fd)
data = make_bytes(ds, randomize=False)
# for now, just a single write, code doesn't care about chunking
assert len(data) == ostream.write(data)
ostream.close()
# its closed already
self.failUnlessRaises(OSError, os.close, fd)
# read everything back, compare to data we zip
fd = os.open(path, os.O_RDONLY | getattr(os, 'O_BINARY', 0))
written_data = os.read(fd, os.path.getsize(path))
assert len(written_data) == os.path.getsize(path)
os.close(fd)
assert written_data == zlib.compress(data, 1) # best speed
os.remove(path)
# END for each os
def test_decompress_reader_special_case(self):
odb = LooseObjectDB(fixture_path('objects'))
mdb = MemoryDB()
for sha in (b'888401851f15db0eed60eb1bc29dec5ddcace911',
b'7bb839852ed5e3a069966281bb08d50012fb309b',):
ostream = odb.stream(hex_to_bin(sha))
# if there is a bug, we will be missing one byte exactly !
data = ostream.read()
assert len(data) == ostream.size
# Putting it back in should yield nothing new - after all, we have the same data already
dump = mdb.store(IStream(ostream.type, ostream.size, BytesIO(data)))
assert dump.hexsha == sha
# end for each loose object sha to test

View File

@ -1,100 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Test for object db"""
import tempfile
import os
from gitdb.test.lib import TestBase
from gitdb.util import (
to_hex_sha,
to_bin_sha,
NULL_HEX_SHA,
LockedFD
)
class TestUtils(TestBase):
def test_basics(self):
assert to_hex_sha(NULL_HEX_SHA) == NULL_HEX_SHA
assert len(to_bin_sha(NULL_HEX_SHA)) == 20
assert to_hex_sha(to_bin_sha(NULL_HEX_SHA)) == NULL_HEX_SHA.encode("ascii")
def _cmp_contents(self, file_path, data):
# raise if data from file at file_path
# does not match data string
with open(file_path, "rb") as fp:
assert fp.read() == data.encode("ascii")
def test_lockedfd(self):
my_file = tempfile.mktemp()
orig_data = "hello"
new_data = "world"
with open(my_file, "wb") as my_file_fp:
my_file_fp.write(orig_data.encode("ascii"))
try:
lfd = LockedFD(my_file)
lockfilepath = lfd._lockfilepath()
# cannot end before it was started
self.failUnlessRaises(AssertionError, lfd.rollback)
self.failUnlessRaises(AssertionError, lfd.commit)
# open for writing
assert not os.path.isfile(lockfilepath)
wfd = lfd.open(write=True)
assert lfd._fd is wfd
assert os.path.isfile(lockfilepath)
# write data and fail
os.write(wfd, new_data.encode("ascii"))
lfd.rollback()
assert lfd._fd is None
self._cmp_contents(my_file, orig_data)
assert not os.path.isfile(lockfilepath)
# additional call doesn't fail
lfd.commit()
lfd.rollback()
# test reading
lfd = LockedFD(my_file)
rfd = lfd.open(write=False)
assert os.read(rfd, len(orig_data)) == orig_data.encode("ascii")
assert os.path.isfile(lockfilepath)
# deletion rolls back
del(lfd)
assert not os.path.isfile(lockfilepath)
# write data - concurrently
lfd = LockedFD(my_file)
olfd = LockedFD(my_file)
assert not os.path.isfile(lockfilepath)
wfdstream = lfd.open(write=True, stream=True) # this time as stream
assert os.path.isfile(lockfilepath)
# another one fails
self.failUnlessRaises(IOError, olfd.open)
wfdstream.write(new_data.encode("ascii"))
lfd.commit()
assert not os.path.isfile(lockfilepath)
self._cmp_contents(my_file, new_data)
# could test automatic _end_writing on destruction
finally:
os.remove(my_file)
# END final cleanup
# try non-existing file for reading
lfd = LockedFD(tempfile.mktemp())
try:
lfd.open(write=False)
except OSError:
assert not os.path.exists(lfd._lockfilepath())
else:
self.fail("expected OSError")
# END handle exceptions

View File

@ -1,10 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
"""Module containing information about types known to the database"""
str_blob_type = b'blob'
str_commit_type = b'commit'
str_tree_type = b'tree'
str_tag_type = b'tag'

View File

@ -1,398 +0,0 @@
# Copyright (C) 2010, 2011 Sebastian Thiel (byronimo@gmail.com) and contributors
#
# This module is part of GitDB and is released under
# the New BSD License: http://www.opensource.org/licenses/bsd-license.php
import binascii
import os
import mmap
import sys
import time
import errno
from io import BytesIO
from smmap import (
StaticWindowMapManager,
SlidingWindowMapManager,
SlidingWindowMapBuffer
)
# initialize our global memory manager instance
# Use it to free cached (and unused) resources.
mman = SlidingWindowMapManager()
# END handle mman
import hashlib
try:
from struct import unpack_from
except ImportError:
from struct import unpack, calcsize
__calcsize_cache = dict()
def unpack_from(fmt, data, offset=0):
try:
size = __calcsize_cache[fmt]
except KeyError:
size = calcsize(fmt)
__calcsize_cache[fmt] = size
# END exception handling
return unpack(fmt, data[offset: offset + size])
# END own unpack_from implementation
#{ Aliases
hex_to_bin = binascii.a2b_hex
bin_to_hex = binascii.b2a_hex
# errors
ENOENT = errno.ENOENT
# os shortcuts
exists = os.path.exists
mkdir = os.mkdir
chmod = os.chmod
isdir = os.path.isdir
isfile = os.path.isfile
rename = os.rename
dirname = os.path.dirname
basename = os.path.basename
join = os.path.join
read = os.read
write = os.write
close = os.close
fsync = os.fsync
def _retry(func, *args, **kwargs):
# Wrapper around functions that are problematic on Windows: sometimes
# the OS or another process still holds a handle to the file
if sys.platform == "win32":
for _ in range(10):
try:
return func(*args, **kwargs)
except Exception:
time.sleep(0.1)
return func(*args, **kwargs)
else:
return func(*args, **kwargs)
def remove(*args, **kwargs):
return _retry(os.remove, *args, **kwargs)
# Backwards compatibility imports
from gitdb.const import (
NULL_BIN_SHA,
NULL_HEX_SHA
)
#} END Aliases
#{ compatibility stuff ...
class _RandomAccessBytesIO(object):
"""Wrapper to provide required functionality in case memory maps cannot or may
not be used. This is only really required in python 2.4"""
__slots__ = '_sio'
def __init__(self, buf=''):
self._sio = BytesIO(buf)
def __getattr__(self, attr):
return getattr(self._sio, attr)
def __len__(self):
return len(self.getvalue())
def __getitem__(self, i):
return self.getvalue()[i]
def __getslice__(self, start, end):
return self.getvalue()[start:end]
def byte_ord(b):
"""
Return the integer representation of the byte string. This supports Python
3 byte arrays as well as standard strings.
"""
try:
return ord(b)
except TypeError:
return b
#} END compatibility stuff ...
#{ Routines
def make_sha(source=''.encode("ascii")):
"""A python2.4 workaround for the sha/hashlib module fiasco
**Note** From the dulwich project """
try:
return hashlib.sha1(source)
except NameError:
import sha
sha1 = sha.sha(source)
return sha1
def allocate_memory(size):
""":return: a file-protocol accessible memory block of the given size"""
if size == 0:
return _RandomAccessBytesIO(b'')
# END handle empty chunks gracefully
try:
return mmap.mmap(-1, size) # read-write by default
except EnvironmentError:
# setup real memory instead
# this of course may fail if the amount of memory is not available in
# one chunk - would only be the case in python 2.4, being more likely on
# 32 bit systems.
return _RandomAccessBytesIO(b"\0" * size)
# END handle memory allocation
def file_contents_ro(fd, stream=False, allow_mmap=True):
""":return: read-only contents of the file represented by the file descriptor fd
:param fd: file descriptor opened for reading
:param stream: if False, random access is provided, otherwise the stream interface
is provided.
:param allow_mmap: if True, it is allowed to map the contents into memory, which
allows large files to be handled and accessed efficiently. The file-descriptor
will change its position if this is False"""
try:
if allow_mmap:
# supports stream and random access
try:
return mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
except EnvironmentError:
# python 2.4 issue, 0 wants to be the actual size
return mmap.mmap(fd, os.fstat(fd).st_size, access=mmap.ACCESS_READ)
# END handle python 2.4
except OSError:
pass
# END exception handling
# read manually
contents = os.read(fd, os.fstat(fd).st_size)
if stream:
return _RandomAccessBytesIO(contents)
return contents
def file_contents_ro_filepath(filepath, stream=False, allow_mmap=True, flags=0):
"""Get the file contents at filepath as fast as possible
:return: random access compatible memory of the given filepath
:param stream: see ``file_contents_ro``
:param allow_mmap: see ``file_contents_ro``
:param flags: additional flags to pass to os.open
:raise OSError: If the file could not be opened
**Note** for now we don't try to use O_NOATIME directly as the right value needs to be
shared per database in fact. It only makes a real difference for loose object
databases anyway, and they use it with the help of the ``flags`` parameter"""
fd = os.open(filepath, os.O_RDONLY | getattr(os, 'O_BINARY', 0) | flags)
try:
return file_contents_ro(fd, stream, allow_mmap)
finally:
close(fd)
# END assure file is closed
def sliding_ro_buffer(filepath, flags=0):
"""
:return: a buffer compatible object which uses our mapped memory manager internally
ready to read the whole given filepath"""
return SlidingWindowMapBuffer(mman.make_cursor(filepath), flags=flags)
def to_hex_sha(sha):
""":return: hexified version of sha"""
if len(sha) == 40:
return sha
return bin_to_hex(sha)
def to_bin_sha(sha):
if len(sha) == 20:
return sha
return hex_to_bin(sha)
#} END routines
#{ Utilities
class LazyMixin(object):
"""
Base class providing an interface to lazily retrieve attribute values upon
first access. If slots are used, memory will only be reserved once the attribute
is actually accessed and retrieved the first time. All future accesses will
return the cached value as stored in the Instance's dict or slot.
"""
__slots__ = tuple()
def __getattr__(self, attr):
"""
Whenever an attribute is requested that we do not know, we allow it
to be created and set. Next time the same attribute is requested, it is simply
returned from our dict/slots. """
self._set_cache_(attr)
# will raise in case the cache was not created
return object.__getattribute__(self, attr)
def _set_cache_(self, attr):
"""
This method should be overridden in the derived class.
It should check whether the attribute named by attr can be created
and cached. Do nothing if you do not know the attribute, or delegate to your superclass.
The derived class may create as many additional attributes as it deems
necessary in case a git command returns more information than represented
in the single attribute."""
pass
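# Hedged sketch: a minimal LazyMixin subclass; '_answer' is computed on first access by
# _set_cache_ and then served from the instance dict, so __getattr__ is not hit again.
class _ExampleLazy(LazyMixin):
    def _set_cache_(self, attr):
        if attr == '_answer':
            self._answer = 42
# _ExampleLazy()._answer == 42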
class LockedFD(object):
"""
This class facilitates a safe read and write operation to a file on disk.
If we write to 'file', we obtain a lock file at 'file.lock' and write to
that instead. If we succeed, the lock file will be renamed to overwrite
the original file.
When reading, we obtain a lock file as well, to prevent other writers from
succeeding while we are reading the file.
This type handles errors correctly in that it will assure a consistent state
on destruction.
**note** with this setup, parallel reading is not possible"""
__slots__ = ("_filepath", '_fd', '_write')
def __init__(self, filepath):
"""Initialize an instance with the givne filepath"""
self._filepath = filepath
self._fd = None
self._write = None # if True, we write a file
def __del__(self):
# will do nothing if the file descriptor is already closed
if self._fd is not None:
self.rollback()
def _lockfilepath(self):
return "%s.lock" % self._filepath
def open(self, write=False, stream=False):
"""
Open the file descriptor for reading or writing, both in binary mode.
:param write: if True, the file descriptor will be opened for writing. Otherwise
it will be opened read-only.
:param stream: if True, the file descriptor will be wrapped into a simple stream
object which supports only reading or writing
:return: fd to read from or write to. It is still maintained by this instance
and must not be closed directly
:raise IOError: if the lock could not be retrieved
:raise OSError: If the actual file could not be opened for reading
**note** must only be called once"""
if self._write is not None:
raise AssertionError("Called %s multiple times" % self.open)
self._write = write
# try to open the lock file
binary = getattr(os, 'O_BINARY', 0)
lockmode = os.O_WRONLY | os.O_CREAT | os.O_EXCL | binary
try:
fd = os.open(self._lockfilepath(), lockmode, int("600", 8))
if not write:
os.close(fd)
else:
self._fd = fd
# END handle file descriptor
except OSError:
raise IOError("Lock at %r could not be obtained" % self._lockfilepath())
# END handle lock retrieval
# open actual file if required
if self._fd is None:
# we could specify exclusive here, as we obtained the lock anyway
try:
self._fd = os.open(self._filepath, os.O_RDONLY | binary)
except:
# assure we release our lockfile
remove(self._lockfilepath())
raise
# END handle lockfile
# END open descriptor for reading
if stream:
# need delayed import
from gitdb.stream import FDStream
return FDStream(self._fd)
else:
return self._fd
# END handle stream
def commit(self):
"""When done writing, call this function to commit your changes into the
actual file.
The file descriptor will be closed, and the lock file will be renamed over the original file.
**Note** can be called multiple times"""
self._end_writing(successful=True)
def rollback(self):
"""Abort your operation without any changes. The file descriptor will be
closed, and the lock released.
**Note** can be called multiple times"""
self._end_writing(successful=False)
def _end_writing(self, successful=True):
"""Handle the lock according to the write mode """
if self._write is None:
raise AssertionError("Cannot end operation if it wasn't started yet")
if self._fd is None:
return
os.close(self._fd)
self._fd = None
lockfile = self._lockfilepath()
if self._write and successful:
# on windows, rename does not silently overwrite the existing one
if sys.platform == "win32":
if isfile(self._filepath):
remove(self._filepath)
# END remove if exists
# END win32 special handling
os.rename(lockfile, self._filepath)
# assure others can at least read the file - the tmpfile left it at rw--
# We may also write that file, on windows that boils down to a remove-
# protection as well
chmod(self._filepath, int("644", 8))
else:
# just delete the file so far, we failed
remove(lockfile)
# END successful handling
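# A minimal usage sketch for LockedFD; the filepath and payload are
# illustrative. Take the lock, write the new contents, then either commit the
# lock file over the original or roll back without touching it.
def _demo_locked_write(filepath, data=b'new contents'):
    lfd = LockedFD(filepath)
    fd = lfd.open(write=True)      # creates and holds '<filepath>.lock'
    try:
        os.write(fd, data)
    except Exception:
        lfd.rollback()             # removes the lock file, original untouched
        raise
    lfd.commit()                   # renames the lock file onto the original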
#} END utilities

View File

@ -1,18 +0,0 @@
def force_bytes(data, encoding="utf-8"):
if isinstance(data, bytes):
return data
if isinstance(data, str):
return data.encode(encoding)
return data
def force_text(data, encoding="utf-8"):
if isinstance(data, str):
return data
if isinstance(data, bytes):
return data.decode(encoding)
return str(data, encoding)
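# Usage sketch: both helpers are idempotent for their target type, which makes
# them convenient at API boundaries that may receive either str or bytes.
def _demo_force_roundtrip(value='sha'):
    encoded = force_bytes(value)           # b'sha'
    return force_text(encoded) == value    # True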

View File

@ -1,152 +0,0 @@
jdcal
=====
.. _TPM: http://www.sal.wisc.edu/~jwp/astro/tpm/tpm.html
.. _Jeffrey W. Percival: http://www.sal.wisc.edu/~jwp/
.. _IAU SOFA: http://www.iausofa.org/
.. _pip: https://pypi.org/project/pip/
.. _easy_install: https://setuptools.readthedocs.io/en/latest/easy_install.html
.. image:: https://travis-ci.org/phn/jdcal.svg?branch=master
:target: https://travis-ci.org/phn/jdcal
This module contains functions for converting between Julian dates and
calendar dates.
A function for converting Gregorian calendar dates to Julian dates, and
another function for converting Julian calendar dates to Julian dates
are defined. Two functions for the reverse calculations are also
defined.
Different regions of the world switched to Gregorian calendar from
Julian calendar on different dates. Having separate functions for Julian
and Gregorian calendars allows maximum flexibility in choosing the
relevant calendar.
Julian dates are stored in two floating point numbers (double). Julian
dates, and Modified Julian dates, are large numbers. If only one number
is used, then the precision of the time stored is limited. Using two
numbers, time can be split in a manner that will allow maximum
precision. For example, the first number could be the Julian date for
the beginning of a day and the second number could be the fractional
day. Calculations that need the latter part can now work with maximum
precision.
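For illustration, such a split might look like the following sketch (plain
Python arithmetic, not output from the jdcal functions):
.. code-block:: python
>>> whole = 2400000.5 + 51544.0   # JD at the start of 2000-01-01
>>> frac = 0.25                   # fractional day kept separately
>>> whole + frac                  # combine only when a single number is needed
2451544.75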
All the above functions are "proleptic". This means that they work for
dates on which the concerned calendar is not valid. For example,
Gregorian calendar was not used prior to around October 1582.
A function to test if a given Gregorian calendar year is a leap year is
also defined.
Zero point of Modified Julian Date (MJD) and the MJD of 2000/1/1
12:00:00 are also given as module level constants.
Examples
--------
Some examples are given below. For more information see
https://oneau.wordpress.com/2011/08/30/jdcal/.
Gregorian calendar:
.. code-block:: python
>>> from jdcal import gcal2jd, jd2gcal
>>> gcal2jd(2000,1,1)
(2400000.5, 51544.0)
>>> 2400000.5 + 51544.0 + 0.5
2451545.0
>>> gcal2jd(2000,2,30)
(2400000.5, 51604.0)
>>> gcal2jd(2000,3,1)
(2400000.5, 51604.0)
>>> gcal2jd(2001,2,30)
(2400000.5, 51970.0)
>>> gcal2jd(2001,3,2)
(2400000.5, 51970.0)
>>> jd2gcal(*gcal2jd(2000,1,1))
(2000, 1, 1, 0.0)
>>> jd2gcal(*gcal2jd(1950,1,1))
(1950, 1, 1, 0.0)
>>> gcal2jd(2000,1,1)
(2400000.5, 51544.0)
>>> jd2gcal(2400000.5, 51544.0)
(2000, 1, 1, 0.0)
>>> jd2gcal(2400000.5, 51544.5)
(2000, 1, 1, 0.5)
>>> jd2gcal(2400000.5, 51544.245)
(2000, 1, 1, 0.24500000000261934)
>>> jd2gcal(2400000.5, 51544.1)
(2000, 1, 1, 0.099999999998544808)
>>> jd2gcal(2400000.5, 51544.75)
(2000, 1, 1, 0.75)
Julian calendar:
.. code-block:: python
>>> jd2jcal(*jcal2jd(2000, 1, 1))
(2000, 1, 1, 0.0)
>>> jd2jcal(*jcal2jd(-4000, 10, 11))
(-4000, 10, 11, 0.0)
Gregorian leap year:
.. code-block:: python
>>> from jdcal import is_leap
>>> is_leap(2000)
True
>>> is_leap(2100)
False
JD for zero point of MJD, and MJD for JD2000.0:
.. code-block:: python
>>> from jdcal import MJD_0, MJD_JD2000
>>> print(MJD_0)
2400000.5
>>> print(MJD_JD2000)
51544.5
Installation
------------
The module can be installed using `pip`_ or `easy_install`_::
$ pip install jdcal
or,
::
$ easy_install jdcal
Tests are in ``test_jdcal.py``.
Credits
--------
1. A good amount of the code is based on the excellent `TPM`_ C library
by `Jeffrey W. Percival`_.
2. The inspiration to split Julian dates into two numbers came from the
`IAU SOFA`_ C library. No code or algorithm from the SOFA library is
used in `jdcal`.
License
-------
Released under BSD; see LICENSE.txt.
For comments and suggestions, email to user `prasanthhn` in the `gmail.com`
domain.

Some files were not shown because too many files have changed in this diff.